Abstract. Tree Regular Model Checking (TRMC) is the name of a family of
techniques for analyzing infinite-state systems in which states are
represented by terms, and sets of states by Tree Automata (TA). The
central problem in TRMC is to decide whether a set of bad states
is reachable. The problem of computing a TA representing (an
over-approximation of) the set of reachable states is undecidable, but efficient
solutions based on completion or iteration of tree transducers exist.
Unfortunately, the TRMC framework is unable to efficiently capture both
the complex structure of a system and of some of its features. As an
example, for Java programs, the structure of a term is mainly exploited
to capture the structure of a state of the system. On the other hand,
integers of the Java programs have to be encoded with Peano numbers,
which means that any algebraic operation is potentially represented by
thousands of applications of rewriting rules.
In this paper, we propose Lattice Tree Automata (LTAs), an extended
version of tree automata whose leaves are equipped with lattices. LTAs
allow us to represent possibly infinite sets of interpreted terms. Such
terms are capable of representing complex domains and related operations in
an efficient manner. We also extend classical Boolean operations to LTAs.
Finally, as a major contribution, we introduce a new completion-based
algorithm for computing the possibly infinite set of reachable interpreted
terms in a finite amount of time.


1 Introduction 

Infinite-state models are often used to avoid potentially artificial assumptions
on data structures and architectures, e.g. an artificial bound on the size of a
stack or on the value of an integer variable. At the heart of most of the
techniques that have been proposed for exploring infinite state spaces is a symbolic
representation that can finitely represent infinite sets of states.

In early work on the subject, this representation was domain specific, for
example linear constraints for sets of real vectors [25]. For several years now,
the idea that a generic automata-based representation for sets of states could
be used in many settings has gained ground, starting with finite-word
automata [101111250], and then moving to the more general setting of Tree Regular
Model Checking (TRMC) [111313]. In TRMC, states are represented by trees, sets
of states by tree automata, and the behavior of the system by rewriting rules or
tree transducers. Contrary to specific approaches, TRMC is generic and expressive
enough to describe a broad class of communication protocols [3], various C
programs [12] with complex data structures, multi-threaded programs, and even
cryptographic protocols |22I6|. Any Tree Regular Model Checking approach is
equipped with an acceleration algorithm to compute possibly infinite sets of
states in a finite amount of time. Among such algorithms, one finds completion
by equational abstraction [?], which computes successive automata obtained by
application of the rewriting rules, and merges intermediate states according to an
equivalence relation to enforce the termination of the process.

In [S], the authors proposed an exact translation of the semantics of the Java
Virtual Machine to tree automata and rewriting rules. This translation makes it
possible to analyze Java programs with classical Tree Regular Model Checkers. One of the
major difficulties of this encoding is to capture and handle the two-sided infinite
dimension that can arise in Java programs. Indeed, in such models, infinite
behaviors may be due to unbounded method calls and object creation, or simply
because the program is manipulating unbounded data such as integer variables.
While multiple infinite behaviors can be over-approximated with completion and
equational abstraction [?], their combinations may require the use of artificially
large structures. As an example, in [9], the structure of a configuration is
represented in a very concise manner, as the structure of terms is mainly designed
to efficiently capture program counters, stacks, and so on. On the other hand, integers
and their related operations have to be encoded in Peano arithmetic, which has
an exponential impact on the size of automata representing sets of states as well
as on the computation process. As an example, the addition of x to y requires
the application of x rewriting rules.
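To make the cost of the Peano encoding concrete, the following Python sketch counts rewrite steps for one addition. The two rules used, add(0, y) -> y and add(s(x), y) -> s(add(x, y)), and all helper names are assumptions chosen for illustration, not the rule set of the cited encoding. With these rules the step count is x + 1, that is, linear in the value of x.

```python
# Peano numerals as nested tuples: ("0",) is zero, ("s", n) is the successor of n.
ZERO = ("0",)

def peano(k):
    """Build the Peano numeral s(s(...s(0)...)) with k successors."""
    term = ZERO
    for _ in range(k):
        term = ("s", term)
    return term

def to_int(term):
    """Count the successors of a Peano numeral (iteratively, no recursion)."""
    count = 0
    while term != ZERO:
        term = term[1]
        count += 1
    return count

def add(x, y):
    """Normalise add(x, y) with the assumed rules
         add(s(x), y) -> s(add(x, y))    and    add(0, y) -> y.
    Returns the normal form and the number of rule applications."""
    steps = 0
    pending = 0                  # successors already pulled in front of add(...)
    while x != ZERO:
        x = x[1]                 # one application of add(s(x), y) -> s(add(x, y))
        pending += 1
        steps += 1
    steps += 1                   # one application of add(0, y) -> y
    result = y
    for _ in range(pending):
        result = ("s", result)
    return result, steps

result, steps = add(peano(3), peano(4))
print(to_int(result), steps)
```

An interpreted symbol, by contrast, would compute the same sum in a single evaluation step, which is the motivation developed in the rest of the introduction.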

A solution to the above problem would be to follow the approach of Kaplan [24],
and represent integers in bases greater than or equal to 2, and the operations between
them in the alphabet of the term directly. In such a case, the term could be
interpreted and directly return the result of the operation without applying any
rewriting rule. The study of new Tree Regular Model Checking approaches for
such interpreted terms is the main objective of this paper. Our first contribution
is the definition of Lattice Tree Automata (LTA), a new class of tree automata
that is capable of representing possibly infinite sets of interpreted terms. Roughly
speaking, LTA are classical Tree Automata whose leaves may be equipped with
lattice elements to abstract possibly infinite sets of values. Nodes of LTA can
either be defined on an uninterpreted alphabet, or represent lattice operations,
which allows us to interpret possibly infinite sets of terms in a finite
amount of time. We also propose a study of all the classical automata-based
operations for LTA. The model of LTA is not closed under determinization. In
such a case, the best that can be done is to propose an over-approximation of the
resulting automaton through abstract interpretation. As a third contribution,
we propose a new acceleration algorithm to compute the set of reachable states
of systems whose states are encoded with interpreted terms and sets of states
with LTA. Our algorithm extends the classical completion approach by considering
conditional term rewriting systems for lattices. We show that dealing with
such conditions requires merging the existing completion algorithm with a solver
for abstract domains. We also propose a new type of equational abstraction for
lattices, which allows us to enforce termination in a finite amount of time.
We then show that our algorithm is correct in the sense that it computes an
over-approximation of the set of reachable states. This latter property is only
guaranteed provided that each completion step is followed by an evaluation
operation. This operation, which relies on a widening operator, adds terms that may
be lost during the completion step. Finally, we briefly describe how our solution
can drastically improve the encoding of Java programs in a TRMC environment.

Related Work. This work is inspired by [TH], where the authors proposed to use
finite-word lattice automata to solve the Regular Model Checking problem. Our
major differences are that (1) we work with trees, (2) we propose a more general
acceleration algorithm, and (3) we consider operations on lattices, while they
only label traces with lattices without allowing them to be combined.
Some Regular Model Checking approaches can be found in |4llOI5ll3]. However,
none of them can capture the two infinite dimensions of complex systems in
an efficient manner. Other models, like modal automata [8] or data trees [18120),
consider infinite alphabets, but do not exploit the lattice structure as in our
work. Lattice(-valued) automata [26], whose transitions are labelled by lattice
elements, map words over a finite alphabet to a lattice value. Similar automata
may define fuzzy tree languages [TB]. Other verification of particular classes of
properties of Java programs with interpreted terms can be found in [2 7).

2 Backgrounds 

Rewriting Systems and Tree Automata. Let $\mathcal{F}$ be a finite set of functional
symbols, where each symbol is associated with an arity, and let $\mathcal{X}$ be a countable set
of variables. $\mathcal{T}(\mathcal{F}, \mathcal{X})$ denotes the set of terms and $\mathcal{T}(\mathcal{F})$ denotes the set of ground
terms (terms without variables). The set of variables of a term $t$ is denoted by
$Var(t)$. The set of functional symbols of arity $n$ is denoted by $\mathcal{F}_n$. A position
$p$ for a term $t$ is a word over $\mathbb{N}$. The empty sequence $\varepsilon$ denotes the top-most
position. We denote by $Pos(t)$ the set of positions of a term $t$. If $p \in Pos(t)$, then
$t|_p$ denotes the subterm of $t$ at position $p$ and $t[s]_p$ denotes the term obtained
by replacing the subterm $t|_p$ at position $p$ by the term $s$.

A Term Rewriting System (TRS) $\mathcal{R}$ is a set of rewrite rules $l \to r$, where
$l, r \in \mathcal{T}(\mathcal{F}, \mathcal{X})$, and $Var(l) \supseteq Var(r)$. A rewrite rule $l \to r$ is left-linear if each
variable of $l$ occurs only once in $l$. A TRS $\mathcal{R}$ is left-linear if every rewrite rule
$l \to r$ of $\mathcal{R}$ is left-linear.

We now define Tree Automata (TA for short) that are used to recognize
possibly infinite sets of terms. Let $Q$ be a finite set of symbols of arity 0, called
states, such that $Q \cap \mathcal{F} = \emptyset$. The set of configurations is denoted by $\mathcal{T}(\mathcal{F} \cup Q)$.
A transition is a rewrite rule $c \to q$, where $c$ is a configuration and $q$ is a
state. A transition is normalized when $c = f(q_1, \ldots, q_n)$, $f \in \mathcal{F}$ is of arity $n$,
and $q_1, \ldots, q_n \in Q$. A bottom-up nondeterministic finite tree automaton (tree
automaton for short) over the alphabet $\mathcal{F}$ is a tuple $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$, where
$Q_f \subseteq Q$ is the set of final states and $\Delta$ is a set of normalized transitions.

The transitive and reflexive rewriting relation on $\mathcal{T}(\mathcal{F} \cup Q)$ induced by $\mathcal{A}$ is
denoted by $\to^*_{\mathcal{A}}$. The tree language recognized by $\mathcal{A}$ in a state $q$ is
$\mathcal{L}(\mathcal{A}, q) = \{t \in \mathcal{T}(\mathcal{F}) \mid t \to^*_{\mathcal{A}} q\}$. We define $\mathcal{L}(\mathcal{A}) = \bigcup_{q \in Q_f} \mathcal{L}(\mathcal{A}, q)$.
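The recognition relation of a tree automaton can be sketched in a few lines of Python. The encoding below is an assumption made for illustration: a term is a tuple `(symbol, child_1, ..., child_n)`, and the set of normalized transitions is a dictionary mapping `(symbol, (q_1, ..., q_n))` to the set of target states. The example automaton, which recognizes the terms f^n(a) with n even, is hypothetical.

```python
from itertools import product

def reachable_states(term, delta):
    """Return every state q such that term ->* q, computed bottom-up."""
    symbol, *children = term
    child_states = [reachable_states(child, delta) for child in children]
    states = set()
    # Try every way of picking one reachable state per child.
    for combo in product(*child_states):
        states |= delta.get((symbol, combo), set())
    return states

def accepts(term, delta, final_states):
    """A term is in the language iff it can be rewritten to a final state."""
    return bool(reachable_states(term, delta) & final_states)

# Hypothetical automaton: a -> q_even, f(q_even) -> q_odd, f(q_odd) -> q_even.
DELTA = {
    ("a", ()): {"q_even"},
    ("f", ("q_even",)): {"q_odd"},
    ("f", ("q_odd",)): {"q_even"},
}
FINAL = {"q_even"}

print(accepts(("f", ("f", ("a",))), DELTA, FINAL))
print(accepts(("f", ("a",)), DELTA, FINAL))
```

Because the automaton may be nondeterministic, the function returns a set of states rather than a single state.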

Lattices, atomic lattices, Galois connections. A partially ordered set $(\Lambda, \sqsubseteq)$ is a
lattice if it admits a smallest element $\bot$ and a greatest element $\top$, and if any
finite set of elements $X \subseteq \Lambda$ admits a greatest lower bound (glb) $\sqcap X$ and a least
upper bound (lub) $\sqcup X$. A lattice is complete if the glb and lub operators are
defined for all possibly infinite subsets of $\Lambda$. An element $x$ of a lattice $(\Lambda, \sqsubseteq)$ is an
atom if it is minimal, i.e. $\bot \sqsubset x \wedge \forall y \in \Lambda : \bot \sqsubset y \sqsubseteq x \Rightarrow y = x$. The set of atoms
of $\Lambda$ is denoted by $Atoms(\Lambda)$. A lattice $(\Lambda, \sqsubseteq)$ is atomic if every element $x \in \Lambda$ where
$x \neq \bot$ is the least upper bound of atoms, i.e. $x = \bigsqcup \{a \mid a \in Atoms(\Lambda) \wedge a \sqsubseteq x\}$.

Consider two lattices $(C, \sqsubseteq_C)$ (the concrete domain) and $(A, \sqsubseteq_A)$ (the
abstract domain). We say that there is a Galois connection between the two
lattices if there are two monotonic functions $\alpha : C \to A$ and $\gamma : A \to C$ such
that $\forall x \in C, y \in A$, $\alpha(x) \sqsubseteq_A y$ if and only if $x \sqsubseteq_C \gamma(y)$. As an example, sets
of integers $(2^{\mathbb{Z}}, \subseteq)$ can be abstracted by the atomic lattice $(\Lambda, \sqsubseteq)$ of intervals,
whose bounds belong to $\mathbb{Z} \cup \{-\infty, +\infty\}$ and whose atoms are of the form $[x, x]$,
for each $x \in \mathbb{Z}$. Any operation $op$ defined on a concrete domain $C$ can be lifted to
an operation $op^\#$ on the corresponding abstract domain $A$, thanks to the Galois
connection.
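As an illustration of the Galois connection between sets of integers and intervals, the Python sketch below checks the defining equivalence on finite sets. The representation is an assumption: an interval is a pair `(lo, hi)`, the bottom element is `None`, and `in_gamma` tests membership in the concretization instead of building a possibly infinite set.

```python
import math

BOTTOM = None                        # the empty interval
TOP = (-math.inf, math.inf)          # the interval containing every integer

def alpha(values):
    """Abstraction: the smallest interval containing a finite set of integers."""
    values = list(values)
    return (min(values), max(values)) if values else BOTTOM

def leq(i, j):
    """Interval ordering: i is included in j."""
    if i is BOTTOM:
        return True
    if j is BOTTOM:
        return False
    return j[0] <= i[0] and i[1] <= j[1]

def in_gamma(x, i):
    """Membership of the integer x in the concretization of interval i."""
    return i is not BOTTOM and i[0] <= x <= i[1]

def galois_holds(values, interval):
    """Check  alpha(S) <= y  iff  S is a subset of gamma(y)  for one S and one y."""
    left = leq(alpha(values), interval)
    right = all(in_gamma(x, interval) for x in values)
    return left == right

print(alpha({1, 3, 4}))
print(galois_holds({1, 3, 4}, (0, 5)), galois_holds({1, 7}, (0, 5)))
```

The abstraction loses information: `alpha({1, 3, 4})` also covers 2, which is the kind of over-approximation the rest of the paper relies on.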



3 Lattice Tree Automata 

In this section, we first explain how to add elements of a concrete domain into 
terms, which has been defined in [51], and how to derive an abstract domain 
from a concrete one. Then we propose a new type of tree automata recognizing 
terms with elements of a lattice and study its properties. 



3.1 Discussion 

We first discuss the reason why we chose to consider tree automata with
leaves that are labelled by elements of an atomic lattice. Recall that the
main goal of this work is to extend the TRMC approach to tree automata that
represent sets of interpreted terms. We may assume that the interpreted terms
of a given set are similar to each other, for example $\{f(1), f(2), f(3), f(4)\}$. We
can naively encode this set of terms by a tree automaton with the transitions
$1 \to q$, $2 \to q$, $3 \to q$, $4 \to q$, $f(q) \to q_f$. This naive encoding is quite inefficient,
and we would prefer to label the leaves of the tree not by integers, but by a
set of integers. The new tree automaton has only two transitions:
$\{1, 2, 3, 4\} \to q$, $f(q) \to q_f$.

This is the reason why we consider the notion of LTA: there, sets of
integers are just a particular lattice. By considering tree automata with a generic
lattice, we can also improve the efficiency of the approach. General sets of integers
are indeed hard to handle, and we often only need an over-approximation of the
set of reachable states. That is why we prefer to label the leaves of the tree
by elements of an abstract lattice $\Lambda$ such as the lattice of intervals. The Galois
connection ensures that the concrete operations (e.g. $+$, $\times$) on integers have an
abstract semantics, and that the approximations are sound.

In order to simplify the notations, we did not emphasize in this paper the
abstract interpretation aspects. For example, when we say that "the concrete
domain is $\mathcal{D} = \mathbb{N}$, the abstract domain is $(\Lambda, \sqsubseteq)$", it really means that the
concrete lattice is $(2^{\mathbb{N}}, \subseteq)$ and that there is a Galois connection with $(\Lambda, \sqsubseteq)$.
In the examples, we implicitly apply the concretization function, which is the
identity (if the abstract lattice is the lattice of intervals). We can also define
LTA even when there is no Galois connection between the concrete lattice and
the abstract one. In this case, the function $eval^\#$ must be defined so that we
still have an over-approximation of the concrete operations.

There are two reasons why we consider only atomic abstract lattices, and why
the language of an LTA is defined on terms built with the atoms rather than
with any elements of the lattice. The first one is that we are mostly interested in
representing sets of integers. Since the atoms are the integers, the semantics of a
lambda transition is to recognize a set of integers. The other reason is a technical
one: it ensures that when we transform an LTA according to a partition, we do
not change the recognized language, since the set of atoms is preserved by this
transformation.



3.2 Interpreted Symbols and Evaluation 

In what follows, elements of a concrete and possibly infinite domain $\mathcal{D}$ will be
represented by a set of interpreted symbols $\mathcal{F}_\bullet$. The set of symbols is now denoted
by $\mathcal{F} = \mathcal{F}_\circ \cup \mathcal{F}_\bullet$, where $\mathcal{F}_\circ$ is the set of passive (uninterpreted) symbols. The set
of interpreted symbols $\mathcal{F}_\bullet$ is composed of elements of $\mathcal{D}$ (i.e. $\mathcal{D} \subseteq \mathcal{F}_\bullet$), whose arity
is 0, and of some predefined operations $f : \mathcal{D}^n \to \mathcal{D}$, where
$f \in \mathcal{F}_n$. For example, if $\mathcal{D} = \mathbb{N}$, then $\mathcal{F}_\bullet$ can be $\mathbb{N} \cup \{+, -, *\}$. Passive symbols can
be seen as usual non-interpreted functional operators, and interpreted symbols
stand for built-in operations on the domain $\mathcal{D}$.

The set $\mathcal{T}(\mathcal{F}_\bullet)$ of terms built on $\mathcal{F}_\bullet$ can be evaluated by using an eval function
$eval : \mathcal{T}(\mathcal{F}_\bullet) \to \mathcal{D}$. The purpose of eval is to simplify a term using the built-in
operations of the domain $\mathcal{D}$. The eval function naturally extends to $\mathcal{T}(\mathcal{F})$ in
the following way: $eval(f(t_1, \ldots, t_n)) = f(eval(t_1), \ldots, eval(t_n))$ if $f \in \mathcal{F}_\circ$ or
$\exists i \in \{1, \ldots, n\} : t_i \notin \mathcal{T}(\mathcal{F}_\bullet)$. Otherwise, $f(t_1, \ldots, t_n) \in \mathcal{T}(\mathcal{F}_\bullet)$ and the evaluation
returns an element of $\mathcal{D}$.

To deal with infinite alphabets (e.g. $\mathbb{R}$ or $\mathbb{N}$), we propose to replace the
concrete domain $\mathcal{D}$ by an abstract one $\Lambda$, linked to $\mathcal{D}$ by a Galois connection.
Moreover, we assume that $(\Lambda, \sqsubseteq)$ is an atomic lattice and that the built-in
symbols are $\sqcup$ and $\sqcap$, whose arity is 2, and other symbols corresponding to the
abstraction of $\mathcal{F}_\bullet$.

Let $OP$ be the set of operations $op$ defined on $\mathcal{D}$, and $OP^\#$ the set of
corresponding operations $op^\#$ defined on $\Lambda$. We have that $\mathcal{F}_\bullet = \mathcal{D} \cup OP$, and the
corresponding abstract set is defined by $\mathcal{F}_\bullet^\# = \Lambda \cup OP^\# \cup \{\sqcup, \sqcap\}$. For example,
let $I$ be the set of intervals with bounds belonging to $\mathbb{Z} \cup \{-\infty, +\infty\}$. The set
$\mathcal{F}_\bullet = \mathbb{Z} \cup \{+, -\}$ can be abstracted by $\mathcal{F}_\bullet^\# = I \cup \{+^\#, -^\#, \sqcup, \sqcap\}$. Terms containing
some operators extended to the abstract domain have to be evaluated, as
explained in Section 3.2 for the concrete domain. $eval^\# : \mathcal{T}(\mathcal{F}_\bullet^\#) \to \Lambda$ is the best
approximation of $eval$ w.r.t. the Galois connection.

Example 1 ($eval^\#$ function). For the lattice of intervals on $\mathbb{Z}$, we have that:

- $eval^\#(i) = i$ for any interval $i$,

- For any $f \in \{+^\#, -^\#, \sqcup, \sqcap\}$, $eval^\#(f(i_1, i_2))$ is defined, given $eval^\#(i_1) = [a, b]$
  and $eval^\#(i_2) = [c, d]$, by:
  $eval^\#([a, b] \sqcup [c, d]) = [\min(a, c), \max(b, d)]$,
  $eval^\#([a, b] \sqcap [c, d]) = [\max(a, c), \min(b, d)]$ if $\max(a, c) \leq \min(b, d)$, else $eval^\#([a, b] \sqcap [c, d]) = \bot$,
  $eval^\#([a, b] +^\# [c, d]) = [a + c, b + d]$,
  $eval^\#([a, b] -^\# [c, d]) = [a - d, b - c]$.
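The interval rules of Example 1 translate directly into a small recursive evaluator. The following Python sketch is illustrative only: the term encoding (an interval is a pair `(lo, hi)`, an operation is a triple `(op, left, right)`, bottom is `None`) and the operator names `"join"` and `"meet"` are assumptions, and bottom is propagated through `+` and `-`.

```python
BOTTOM = None   # the empty interval

def join(i, j):
    """Least upper bound of two intervals."""
    if i is BOTTOM:
        return j
    if j is BOTTOM:
        return i
    return (min(i[0], j[0]), max(i[1], j[1]))

def meet(i, j):
    """Greatest lower bound; bottom when the intervals do not overlap."""
    if i is BOTTOM or j is BOTTOM:
        return BOTTOM
    lo, hi = max(i[0], j[0]), min(i[1], j[1])
    return (lo, hi) if lo <= hi else BOTTOM

def add(i, j):
    if i is BOTTOM or j is BOTTOM:
        return BOTTOM
    return (i[0] + j[0], i[1] + j[1])

def sub(i, j):
    if i is BOTTOM or j is BOTTOM:
        return BOTTOM
    return (i[0] - j[1], i[1] - j[0])

OPS = {"+": add, "-": sub, "join": join, "meet": meet}

def eval_sharp(term):
    """Evaluate an abstract term built from intervals and binary operators."""
    if term is BOTTOM or len(term) == 2:      # already an interval
        return term
    op, left, right = term
    return OPS[op](eval_sharp(left), eval_sharp(right))

print(eval_sharp(("+", (0, 4), (2, 6))))
print(eval_sharp(("meet", (1, 3), (5, 8))))
```

Nested terms are evaluated bottom-up, so `("join", ("+", (0, 1), (1, 1)), (5, 6))` first evaluates the sum and then joins the result with `(5, 6)`.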

3.3 The Lattice Tree Automata Model 

Lattice tree automata are extended tree automata recognizing terms defined on
$\mathcal{F}_\circ \cup \mathcal{F}_\bullet^\#$.

Definition 1 (lattice tree automaton). A bottom-up nondeterministic finite
tree automaton with lattice (lattice tree automaton for short, LTA) is a tuple
$\mathcal{A} = (\mathcal{F} = \mathcal{F}_\circ \cup \mathcal{F}_\bullet^\#, Q, Q_f, \Delta)$, where $\mathcal{F}$ is a set of passive and interpreted symbols,
$Q$ and $Q_f$ are sets of states, $Q_f \subseteq Q$, and $\Delta$ is a set of normalized transitions.

The set of lambda transitions is defined by
$\Delta_\Lambda = \{\lambda \to q \mid \lambda \to q \in \Delta \wedge \lambda \neq \bot \wedge \lambda \in \Lambda\}$.
The set of ground transitions is the set of other transitions of the automaton,
and is formally defined by
$\Delta_G = \{f(q_1, \ldots, q_n) \to q \mid f(q_1, \ldots, q_n) \to q \in \Delta \wedge q, q_1, \ldots, q_n \in Q\}$.

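We extend the partial ordering $\sqsubseteq$ (on $\Lambda$) to $\mathcal{T}(\mathcal{F})$:

Definition 2. Given $s, t \in \mathcal{T}(\mathcal{F})$, $s \sqsubseteq t$ iff (1) $s \sqsubseteq t$ (if both $s$ and $t$ belong to
$\Lambda$), (2) $eval(s) \sqsubseteq eval(t)$ (if both $s$ and $t$ belong to $\mathcal{T}(\mathcal{F}_\bullet^\#)$), (3) $s = t$ (if both
$s$ and $t$ belong to $\mathcal{F}_\circ^0$), or (4) $s = f(s_1, \ldots, s_n)$, $t = f(t_1, \ldots, t_n)$, $f \in \mathcal{F}_\circ^n$ and
$s_1 \sqsubseteq t_1 \wedge \ldots \wedge s_n \sqsubseteq t_n$.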

Example 2. $f(g(a, [1, 5])) \sqsubseteq f(g(a, [0, 8]))$, and $h([0, 4] +^\# [2, 6]) \sqsubseteq h([1, 3] +^\# [1, 9])$.

In what follows we will omit # when it is clear from the context. We now 
define the transition relation induced by an LTA. The difference with TA is that 
a term t is recognized by an LTA if eval(t) can be reduced in the LTA. 



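Definition 3 ($t_1 \to_{\mathcal{A}} t_2$ for lattice tree automata). Let $t_1, t_2 \in \mathcal{T}(\mathcal{F} \cup Q)$.
$t_1 \to_{\mathcal{A}} t_2$ iff, for any position $p \in Pos(t_1)$:

- if $t_1|_p \in \mathcal{T}(\mathcal{F}_\bullet^\#)$, there is a transition $\lambda \to q \in \Delta$ such that $eval(t_1|_p) \sqsubseteq \lambda$
  and $t_2 = t_1[q]_p$;

- if $t_1|_p = a$ where $a \in \mathcal{F}_\circ^0$, there is a transition $a \to q \in \Delta$ such that
  $t_2 = t_1[q]_p$;

- if $t_1|_p = f(s_1, \ldots, s_n)$ where $f \in \mathcal{F}_\circ^n$ and $s_1, \ldots, s_n \in \mathcal{T}(\mathcal{F} \cup Q)$, $\exists s_i' \in
  \mathcal{T}(\mathcal{F} \cup Q)$ such that $s_i \to_{\mathcal{A}} s_i'$ and $t_2 = t_1[f(s_1, \ldots, s_{i-1}, s_i', s_{i+1}, \ldots, s_n)]_p$.

$\to^*_{\mathcal{A}}$ is the reflexive transitive closure of $\to_{\mathcal{A}}$. There is a run from $t_1$ to $t_2$ if
$t_1 \to^*_{\mathcal{A}} t_2$.

The set $\mathcal{T}(\mathcal{F}, Atoms(\Lambda))$ denotes the set of ground terms built over
$(\mathcal{F} \setminus \Lambda) \cup Atoms(\Lambda)$. Tree automata with lattice recognize a tree language over
$\mathcal{T}(\mathcal{F}, Atoms(\Lambda))$.

Definition 4 (Recognized language). The tree language recognized by $\mathcal{A}$ in
a state $q$ is $\mathcal{L}(\mathcal{A}, q) = \{t \in \mathcal{T}(\mathcal{F}, Atoms(\Lambda)) \mid \exists t'$ such that $t \sqsubseteq t'$ and $t' \to^*_{\mathcal{A}} q\}$.
The language recognized by $\mathcal{A}$ is $\mathcal{L}(\mathcal{A}) = \bigcup_{q \in Q_f} \mathcal{L}(\mathcal{A}, q)$.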

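Example 3 (Run, recognized language). Let $\mathcal{A} = (\mathcal{F} = \mathcal{F}_\circ \cup \mathcal{F}_\bullet^\#, Q, Q_f, \Delta)$ be
an LTA where $\Delta = \{[0, 4] \to q_1, f(q_1) \to q_2\}$ and final state $q_2$. We have:
$f([1, 4]) \to^* q_2$ and $f([0, 2]) \to^* q_2$, and the recognized language of $\mathcal{A}$ is given by
$\mathcal{L}(\mathcal{A}) = \mathcal{L}(\mathcal{A}, q_2) = \{f([0, 0]), f([1, 1]), \ldots, f([4, 4])\}$.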
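Example 3 can be replayed with a short Python sketch of an LTA run over the interval lattice. The encoding is an assumption made for illustration: an interval leaf is a pair `(lo, hi)`, a node is a tuple `(symbol, child_1, ..., child_n)` whose first component is a string, and interpreted subterms are supposed to be already evaluated to a single interval, so the eval step of Definition 3 is not shown.

```python
from itertools import product

def leq(i, j):
    """Interval ordering: i is included in j."""
    return j[0] <= i[0] and i[1] <= j[1]

def run(term, lambda_trans, ground_trans):
    """Return the set of states the term can be reduced to."""
    if not isinstance(term[0], str):
        # Interval leaf: a lambda transition applies when the leaf is below it.
        return {q for lam, q in lambda_trans if leq(term, lam)}
    symbol, *children = term
    child_states = [run(c, lambda_trans, ground_trans) for c in children]
    states = set()
    for combo in product(*child_states):
        states |= ground_trans.get((symbol, combo), set())
    return states

def recognizes(term, lambda_trans, ground_trans, final_states):
    return bool(run(term, lambda_trans, ground_trans) & final_states)

# The LTA of Example 3: [0,4] -> q1, f(q1) -> q2, with final state q2.
LAMBDA = [((0, 4), "q1")]
GROUND = {("f", ("q1",)): {"q2"}}
FINAL = {"q2"}

# Atoms [x,x] such that f([x,x]) belongs to the recognized language.
language = [x for x in range(-2, 8)
            if recognizes(("f", (x, x)), LAMBDA, GROUND, FINAL)]
print(language)
```

The list comprehension enumerates atoms over a small window of integers only, which is enough to observe that exactly the atoms below [0,4] are accepted.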



3.4 Operations on LTA 

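Most of the algorithms for Boolean operations on LTA are straightforward
adaptations of those defined on TA (see [15]).

LTAs are closed under union and intersection, and we briefly explain how these
two operations $\cup$ and $\cap$ can be performed on two LTAs $\mathcal{A} = (\mathcal{F}, Q, Q_f, \Delta)$ and
$\mathcal{A}' = (\mathcal{F}, Q', Q'_f, \Delta')$:

- $\mathcal{A} \cup \mathcal{A}' = (\mathcal{F}, Q \cup Q', Q_f \cup Q'_f, \Delta \cup \Delta')$, assuming that the sets $Q$ and $Q'$ are
  disjoint.

- $\mathcal{A} \cap \mathcal{A}'$ is recognized by the LTA $\mathcal{A}_\cap = (\mathcal{F}, Q \times Q', Q_f \times Q'_f, \Delta_\cap)$, where
  the transitions of $\Delta_\cap$ are defined by the rules:

  $$\frac{\lambda \to q \in \Delta \qquad \lambda' \to q' \in \Delta' \qquad \lambda \sqcap \lambda' \neq \bot}{\lambda \sqcap \lambda' \to (q, q')}$$

  and

  $$\frac{f(q_1, \ldots, q_n) \to q \in \Delta \qquad f(q'_1, \ldots, q'_n) \to q' \in \Delta'}{f((q_1, q'_1), \ldots, (q_n, q'_n)) \to (q, q')}$$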
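The two product rules for intersection can be sketched as follows in Python, for the interval lattice. The encoding is hypothetical: an LTA is a triple `(lambda_trans, ground_trans, final_states)`, where `lambda_trans` is a list of `(interval, state)` pairs and `ground_trans` maps `(symbol, args)` to a set of target states.

```python
def meet(i, j):
    """Greatest lower bound of two intervals, None standing for bottom."""
    lo, hi = max(i[0], j[0]), min(i[1], j[1])
    return (lo, hi) if lo <= hi else None

def intersect(lta1, lta2):
    """Product construction following the two inference rules."""
    lam1, ground1, final1 = lta1
    lam2, ground2, final2 = lta2

    # Rule 1: pair lambda transitions whose meet is not bottom.
    lam = []
    for l1, q1 in lam1:
        for l2, q2 in lam2:
            m = meet(l1, l2)
            if m is not None:
                lam.append((m, (q1, q2)))

    # Rule 2: pair ground transitions with the same symbol and arity.
    ground = {}
    for (f, args1), targets1 in ground1.items():
        for (g, args2), targets2 in ground2.items():
            if f == g and len(args1) == len(args2):
                key = (f, tuple(zip(args1, args2)))
                ground.setdefault(key, set()).update(
                    (t1, t2) for t1 in targets1 for t2 in targets2)

    final = {(q1, q2) for q1 in final1 for q2 in final2}
    return lam, ground, final

A1 = ([((0, 4), "q1")], {("f", ("q1",)): {"q2"}}, {"q2"})
A2 = ([((3, 9), "p1")], {("f", ("p1",)): {"p2"}}, {"p2"})
lam, ground, final = intersect(A1, A2)
print(lam, ground, final)
```

The sketch builds only the transitions and final states; the full state set $Q \times Q'$ is left implicit.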



Assuming that the LTA is deterministic, the complement automaton is obtained
by complementing the set of final states. To decide if the language
described by an LTA is empty or not, it suffices to observe that an LTA accepts
at least one tree if and only if there is a reachable final state. A reduced
automaton is an automaton without inaccessible states. The language recognized
by a reduced automaton is empty if and only if the set of final states is empty.
As a first step we thus have to reduce the LTA, that is, to remove the set of
unreachable states.

Let us recall the reduction algorithm: 

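Reduction Algorithm
input: LTA $\mathcal{A} = (\mathcal{F}, Q, Q_f, \Delta)$
begin
    Marked := $\emptyset$
    /* Marked is the set of accessible states */
    repeat
        if $\exists a \in \mathcal{F}^0 = \mathcal{F}_\circ^0 \cup \mathcal{F}_\bullet^{\# 0}$ such that $a \to q \in \Delta$
        or $\exists f \in \mathcal{F}^n = \mathcal{F}_\circ^n \cup \mathcal{F}_\bullet^{\# n}$ such that $f(q_1, \ldots, q_n) \to q \in \Delta$
           where $q_1, \ldots, q_n \in$ Marked
        then Marked := Marked $\cup \{q\}$
    until no state can be added to Marked
    $Q_r$ := Marked
    $Q_{rf}$ := $Q_f \cap$ Marked
    $\Delta_r$ := $\{f(q_1, \ldots, q_n) \to q \in \Delta \mid q, q_1, \ldots, q_n \in$ Marked$\}$
output: Reduced LTA $\mathcal{A}_r = (\mathcal{F}, Q_r, Q_{rf}, \Delta_r)$
end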

Then, let $\mathcal{A}$ be an LTA and $\mathcal{A}_r = (\mathcal{F}, Q_r, Q_{rf}, \Delta_r)$ the corresponding reduced
LTA; $\mathcal{L}(\mathcal{A})$ is empty iff $Q_{rf} = \emptyset$.
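The reduction algorithm and the emptiness test can be sketched in Python as a fixpoint computation. The encoding is an assumption made for illustration: lambda transitions are `(interval, state)` pairs with non-bottom intervals, and constants of arity 0 are ground transitions with an empty argument tuple.

```python
def reduce_lta(lambda_trans, ground_trans, final_states):
    """Keep only accessible states and the transitions between them."""
    # Every lambda transition (with a non-bottom label) makes its target accessible.
    marked = {q for _, q in lambda_trans}
    changed = True
    while changed:
        changed = False
        for (_, args), targets in ground_trans.items():
            # A ground transition fires once all its argument states are marked.
            if all(a in marked for a in args) and not targets <= marked:
                marked |= targets
                changed = True
    reduced_ground = {key: set(targets)
                      for key, targets in ground_trans.items()
                      if all(a in marked for a in key[1])}
    return marked, final_states & marked, list(lambda_trans), reduced_ground

def is_empty(lambda_trans, ground_trans, final_states):
    """The language is empty iff no final state is accessible."""
    return not reduce_lta(lambda_trans, ground_trans, final_states)[1]

# Hypothetical LTA in which q3 and q4 are not accessible.
LAMBDA = [((0, 4), "q1")]
GROUND = {("f", ("q1",)): {"q2"}, ("g", ("q3",)): {"q4"}}

print(reduce_lta(LAMBDA, GROUND, {"q2", "q4"}))
print(is_empty(LAMBDA, GROUND, {"q4"}))
```

The loop stops as soon as a full pass over the ground transitions marks no new state, which mirrors the "until no state can be added to Marked" condition.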

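Let $\mathcal{A}_1$, $\mathcal{A}_2$ be two LTAs. We have $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2) \Leftrightarrow \mathcal{L}(\mathcal{A}_1 \cap \overline{\mathcal{A}_2}) = \emptyset$.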

Complementation and inclusion require a deterministic input LTA. However,
by adapting the proof for finite-word lattice automata given in [T^i], one
can show that LTA are not closed under determinization. In the next section,
we propose an algorithm that computes a deterministic over-approximating
automaton for any given LTA. This algorithm, which extends the one of [19], relies
on a partition function that can be refined to make the over-approximation more
precise.

3.5 Determinization 

As we shall now see, an LTA $\mathcal{A} = (\mathcal{F}, Q, Q_f, \Delta)$ is deterministic if there is
no transition $f(q_1, \ldots, q_n) \to q$, $f(q_1, \ldots, q_n) \to q'$ in $\Delta$ such that $q \neq q'$, where
$f \in \mathcal{F}_n$, and no transition $\lambda_1 \to q$, $\lambda_2 \to q'$ such that $q \neq q'$ and $\lambda_1 \sqcap \lambda_2 \neq \bot$,
where $\lambda_1, \lambda_2 \in \Lambda$. As an example, if $\Delta = \{[1, 3] \to q_1, [2, 5] \to q_2\}$, then
$\mathcal{A}$ is not deterministic.



Determinizing an LTA requires complementation on elements of the lattice.
Indeed, consider the LTA $\mathcal{A}$ having the following transitions: $[-3, 2] \to q_1$ and
$[1, 6] \to q_2$. The deterministic LTA corresponding to $\mathcal{A}$ should have the following
transitions: $[-3, 1[ \to q_1$, $[1, 2] \to \{q_1, q_2\}$ and $]2, 6] \to q_2$. To produce those
transitions, we have to compute $[-3, 2] \sqcap [1, 6] = [1, 2]$, and then $[-3, 2] \setminus [1, 2]$
and $[1, 6] \setminus [1, 2]$. Unfortunately, there are lattices that are not closed under
complementation. As a consequence, determinization of an LTA does not preserve
the recognized language.

The solution proposed in [19] for word automata is to use a finite partition
of the lattice $\Lambda$, which determines when two transitions should be merged using
the lub operator. The fusion of transitions may induce an over-approximation
controlled by the fineness of the partition.

Partitioned LTA. $\Pi$ is a partition of an atomic lattice $\Lambda$ if $\Pi \subseteq 2^\Lambda$ and
$\forall \pi_1, \pi_2 \in \Pi$, $\pi_1 \sqcap \pi_2 = \bot$, and $\forall a \in Atoms(\Lambda)$, $\exists \pi \in \Pi : a \sqsubseteq \pi$. As an example, if
$\Lambda$ is the lattice of intervals, we can have a partition $\Pi = \{]-\infty, 0[, [0, 0], ]0, +\infty[\}$.

Definition 5 (Partitioned lattice tree automaton (PLTA)). A PLTA $\mathcal{A}$
is an LTA $\mathcal{A} = (\Pi, Q, \mathcal{F}, Q_f, \Delta)$ equipped with a partition $\Pi$, such that for all
lambda transitions $\lambda \to q \in \Delta$, $\exists \pi \in \Pi$ such that $\lambda \sqsubseteq \pi$.

A PLTA is merged if $\lambda_1 \to q, \lambda_2 \to q \in \Delta \wedge \lambda_1 \sqsubseteq \pi_1 \wedge \lambda_2 \sqsubseteq \pi_2 \Rightarrow
\pi_1 \sqcap \pi_2 = \emptyset$, where $\lambda_1, \lambda_2 \in \Lambda$ and $\pi_1, \pi_2 \in \Pi$.

For example, if $\Pi = \{]-\infty, 0[, [0, 0], ]0, +\infty[\}$, a PLTA can have the following
transition rules: $[-3, -1] \to q_1$, $[-5, -2] \to q_2$, $[3, 4] \to q_4$. This PLTA is not
merged because of the two lambda transitions $[-3, -1] \to q_1$ and $[-5, -2] \to q_2$,
since $[-3, -1]$ and $[-5, -2]$ are in the same element of the partition. The corresponding
merged PLTA will have the transition $[-5, -1] \to q_{1,2}$ instead of the two
transitions mentioned before.

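Any LTA $\mathcal{A}$ can be turned into a PLTA $\mathcal{A}_p$ in the following way. Let $\Pi$ be
the partition. For any lambda transition $\lambda \to q \in \Delta$, if $\exists \pi_1, \ldots, \pi_n \in \Pi$ such
that $\lambda \sqcap \pi_1 \neq \emptyset, \ldots, \lambda \sqcap \pi_n \neq \emptyset$, where $\pi_1 \neq \ldots \neq \pi_n$, the transition $\lambda \to q$ will
be replaced by $n$ transitions $\lambda \sqcap \pi_1 \to q, \ldots, \lambda \sqcap \pi_n \to q$ in $\mathcal{A}_p$.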

Example 4. Let $\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ be an LTA such that $\Delta = \{[3, 4] \to q_1,
[-3, 2] \to q_2, f(q_1, q_2) \to q_f\}$, and $\Pi = \{]-\infty, 0[, [0, 0], ]0, +\infty[\}$ be a
partition. Then the corresponding PLTA is $\mathcal{A}_p = (Q, \mathcal{F}, Q_f, \Delta_p)$, where $\Delta_p =
\{[3, 4] \to q_1, [-3, 0[ \to q_2, [0, 0] \to q_2, ]0, 2] \to q_2, f(q_1, q_2) \to q_f\}$.
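The splitting step of Example 4 can be reproduced with a short Python sketch. It assumes integer intervals with closed bounds, so the open-bounded elements $]-\infty, 0[$ and $]0, +\infty[$ of the partition are written `(-inf, -1)` and `(1, inf)`; the function name `to_plta` is hypothetical.

```python
import math

def meet(i, j):
    """Greatest lower bound of two closed intervals, None standing for bottom."""
    lo, hi = max(i[0], j[0]), min(i[1], j[1])
    return (lo, hi) if lo <= hi else None

def to_plta(lambda_trans, partition):
    """Replace each lambda transition by its non-empty pieces in the partition."""
    result = []
    for lam, q in lambda_trans:
        for pi in partition:
            piece = meet(lam, pi)
            if piece is not None:
                result.append((piece, q))
    return result

# Partition { ]-oo,0[ , [0,0] , ]0,+oo[ } over the integers.
PARTITION = [(-math.inf, -1), (0, 0), (1, math.inf)]

# Lambda transitions of Example 4: [3,4] -> q1 and [-3,2] -> q2.
LAMBDA = [((3, 4), "q1"), ((-3, 2), "q2")]

print(to_plta(LAMBDA, PARTITION))
```

Over the integers the pieces `(-3, -1)` and `(1, 2)` correspond to $[-3, 0[$ and $]0, 2]$ in the example. Ground transitions are left unchanged by this transformation, so they are not part of the sketch.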

Two lambda transitions $\lambda_1 \to q$, $\lambda_2 \to q$ of a PLTA cannot be merged if
$\lambda_1$ and $\lambda_2$ belong to different elements of the partition, whereas they might be
merged in the opposite case.

Proposition 1 (Equivalence between LTA and PLTA). Given an LTA
$\mathcal{A} = (Q, \mathcal{F}, Q_f, \Delta)$ and a partition $\Pi$, there exists a PLTA $\mathcal{A}' =
(\Pi, Q, \mathcal{F}, Q_f, \Delta')$ recognizing the same language.



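Proof. $\mathcal{A}'$ is obtained from $\mathcal{A}$ by replacing each lambda transition $\lambda \to q \in \Delta$ by
at most $n_\Pi$ transitions $\lambda_i \to q$ where $\lambda_i = \lambda \sqcap \pi_i$, $\pi_i \in \Pi$, such that $\bigsqcup \lambda_i = \lambda$.

Any PLTA $\mathcal{A} = (\Pi, Q, \mathcal{F}, Q_f, \Delta)$ can be transformed into a merged PLTA
$\mathcal{A}_m = (\Pi, Q, \mathcal{F}, Q_f, \Delta_m)$ such that $\mathcal{L}(\mathcal{A}) \subseteq \mathcal{L}(\mathcal{A}_m)$ by merging transitions as
follows:

$$\frac{q \in Q \qquad \pi \in \Pi \qquad \lambda_m = \bigsqcup \{\lambda \sqcap \pi, \lambda \in \Lambda \mid \lambda \to q \in \Delta\}}{\lambda_m \to q \in \Delta_m}$$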

Example 5. If $\mathcal{A} = (\Pi, Q, \mathcal{F}, Q_f, \Delta)$, where $\Pi = \{]-\infty, 0[, [0, +\infty[\}$ and $\Delta =
\{[0, 2] \to q_1, [5, 8] \to q_2, [-3, -2] \to q_3, [-4, -1] \to q_4, h(q_1, q_2, q_3, q_4) \to q_f\}$,
the merged automaton $\mathcal{A}_m = (\Pi, Q, \mathcal{F}, Q_f, \Delta_m)$ corresponding to $\mathcal{A}$ has the
following transitions: $\Delta_m = \{[0, 8] \to q_{1,2}, [-4, -1] \to q_{3,4}, h(q_{1,2}, q_{1,2}, q_{3,4}, q_{3,4}) \to
q_f\}$.

We are now ready to sketch the determinization algorithm. The determinization
of a PLTA, which transforms a PLTA $\mathcal{A}$ into a merged Deterministic
Partitioned LTA $\mathcal{A}_d$ according to a partition $\Pi$, mimics the one on usual TA.
The difference is that two $\lambda$-transitions $\lambda_1 \to q_1$ and $\lambda_2 \to q_2$ are merged into
$\lambda_1 \sqcup \lambda_2 \to \{q_1, q_2\}$ when $\lambda_1$ and $\lambda_2$ are included in the same element $\pi$ of
the partition $\Pi$. Consequently, the resulting automaton recognizes a larger
language: $\mathcal{L}(\mathcal{A}) \subseteq \mathcal{L}(\mathcal{A}_d)$. This algorithm produces the best approximation in terms
of inclusion of languages.

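Determinization Algorithm:
input: PLTA $\mathcal{A} = (\Pi, Q, \mathcal{F}, Q_f, \Delta)$
begin
    $Q_d := \emptyset$; $\Delta_d := \emptyset$;
    for all $\pi \in \Pi$ do
        $Trans(\pi) := \{\lambda \to q \in \Delta \mid \lambda \in \Lambda, \lambda \sqsubseteq \pi\}$;
        $s := \{q \in Q \mid \lambda \to q \in Trans(\pi)\}$;
        $Q_d := Q_d \cup \{s\}$;
        $\lambda_m := \bigsqcup \{\lambda \mid \lambda \to q \in Trans(\pi)\}$;
        $\Delta_d := \Delta_d \cup \{\lambda_m \to s\}$;
    end for
    repeat
        Let $f \in \mathcal{F}_n$, $s_1, \ldots, s_n \in Q_d$,
        $s := \{q \in Q \mid \exists q_1 \in s_1, \ldots, q_n \in s_n, f(q_1, \ldots, q_n) \to q \in \Delta\}$;
        $Q_d := Q_d \cup \{s\}$;
        $\Delta_d := \Delta_d \cup \{f(s_1, \ldots, s_n) \to s\}$;
    until no more rule can be added to $\Delta_d$
    $Q_{df} := \{s \in Q_d \mid s \cap Q_f \neq \emptyset\}$
output: merged DPLTA $\mathcal{A}_d = (\Pi, Q_d, \mathcal{F}, Q_{df}, \Delta_d)$
end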



Example 6. Let $\Delta = \{[-3, -1] \to q_1, [-5, -2] \to q_2, [3, 4] \to q_3, [-3, 2] \to q_4,
f(q_1, q_2) \to q_5, f(q_3, q_4) \to q_6, f(q_5, q_6) \to q_{f1}, f(q_5, q_6) \to q_{f2}\}$, and
$\Pi = \{]-\infty, 0[, [0, 0], ]0, +\infty[\}$.

With the determinization algorithm defined above, we obtain the following set of
transitions for the corresponding deterministic PLTA: $\Delta_d = \{[-5, 0[ \to q_{1,2,4},
]0, 4] \to q_{3,4}, [0, 0] \to q_4, f(q_{1,2,4}, q_{1,2,4}) \to q_5, f(q_{3,4}, q_{3,4}) \to q_6,
f(q_{3,4}, q_4) \to q_6, f(q_{3,4}, q_{1,2,4}) \to q_6, f(q_5, q_6) \to q_{f1,f2}\}$.
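The determinization algorithm can be replayed on the data of Example 6 with the Python sketch below. Several choices are assumptions made for illustration: intervals are closed integer intervals, the input lambda transitions are already split along the partition (so $[-3, 2] \to q_4$ appears as three pieces), determinized states are `frozenset`s of original states, and transitions whose target set would be empty are not recorded.

```python
import math
from itertools import product

def leq(i, j):
    return j[0] <= i[0] and i[1] <= j[1]

def determinize(lambda_trans, ground_trans, final_states, partition):
    """lambda_trans: [(interval, q)]; ground_trans: [(symbol, args, q)]."""
    states, det_lambda, det_ground = set(), {}, {}
    # Lambda part: one merged transition per non-empty partition element.
    for pi in partition:
        group = [(lam, q) for lam, q in lambda_trans if leq(lam, pi)]
        if not group:
            continue
        lub = (min(lam[0] for lam, _ in group), max(lam[1] for lam, _ in group))
        s = frozenset(q for _, q in group)
        states.add(s)
        det_lambda[lub] = s
    # Ground part: subset construction iterated until no rule can be added.
    arities = {(f, len(args)) for f, args, _ in ground_trans}
    changed = True
    while changed:
        changed = False
        for f, n in arities:
            for combo in product(list(states), repeat=n):
                if (f, combo) in det_ground:
                    continue
                s = frozenset(q for g, args, q in ground_trans
                              if g == f and len(args) == n
                              and all(a in c for a, c in zip(args, combo)))
                if s:                      # assumption: skip empty targets
                    det_ground[(f, combo)] = s
                    states.add(s)
                    changed = True
    det_final = {s for s in states if s & final_states}
    return states, det_lambda, det_ground, det_final

PARTITION = [(-math.inf, -1), (0, 0), (1, math.inf)]
LAMBDA = [((-3, -1), "q1"), ((-5, -2), "q2"), ((3, 4), "q3"),
          ((-3, -1), "q4"), ((0, 0), "q4"), ((1, 2), "q4")]
GROUND = [("f", ("q1", "q2"), "q5"), ("f", ("q3", "q4"), "q6"),
          ("f", ("q5", "q6"), "qf1"), ("f", ("q5", "q6"), "qf2")]

states, det_lambda, det_ground, det_final = determinize(
    LAMBDA, GROUND, {"qf1", "qf2"}, PARTITION)
print(det_lambda)
print(det_ground)
```

With this input the sketch yields three merged lambda transitions and five ground transitions, matching the set $\Delta_d$ of Example 6 once $(-5, -1)$ and $(1, 4)$ are read as $[-5, 0[$ and $]0, 4]$.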

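Proposition 2 (Deterministic PLTA is the best upper-approximation).
Let $\mathcal{A}_1$ be a PLTA and $\mathcal{A}_2$ the PLTA obtained with the determinization
algorithm. Then $\mathcal{A}_2$ is a best upper-approximation of $\mathcal{A}_1$ as a merged and
deterministic PLTA:

1. $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$

2. For any merged and deterministic PLTA $\mathcal{A}_3$ based on the same partition as
$\mathcal{A}_1$, $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_3) \Rightarrow \mathcal{L}(\mathcal{A}_2) \subseteq \mathcal{L}(\mathcal{A}_3)$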

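Proof (Proposition 2).

(1) Base case: for every lambda transition $\lambda \to q$ of $\mathcal{A}_1$, let $\pi \in \Pi$ be such
that $\lambda \sqsubseteq \pi$. Then $Trans(\pi) = \{\lambda \to q \in \Delta \mid \lambda \in \Lambda, \lambda \sqsubseteq \pi\}$. Then there is
a transition $\lambda' \to Q$ in $\mathcal{A}_2$ such that $\lambda' = \bigsqcup \{\lambda \mid \lambda \to q \in Trans(\pi)\}$ and
$Q = \{q \mid \lambda \to q \in Trans(\pi)\}$, so $q \in Q$.

Induction case: for every non-lambda transition $f(q_1, \ldots, q_n) \to q$ of $\mathcal{A}_1$, there
is the corresponding transition $f(Q_1, \ldots, Q_n) \to Q$ such that $q \in Q$. We have
$q_1 \in Q_1, \ldots, q_n \in Q_n$ thanks to the induction hypothesis.
So $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$. □

(2) Let $\mathcal{A}_1 = (\Pi, Q_1, \mathcal{F}, Q_{f_1}, \Delta_1)$, $\mathcal{A}_2 = (\Pi, Q_2, \mathcal{F}, Q_{f_2}, \Delta_2)$ and $\mathcal{A}_3 =
(\Pi, Q_3, \mathcal{F}, Q_{f_3}, \Delta_3)$.

As $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$ (by (1)) and $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_3)$, let $\mathcal{R}_1 : Q_1 \times Q_2$ and $\mathcal{R}_2 : Q_1 \times Q_3$
be two simulation relations defining these properties as follows.
Let $q_1 \in Q_1$ and $q_2 \in Q_2$; $(q_1, q_2) \in \mathcal{R}_1$ iff

- $\lambda_1 \to q_1 \in \Delta_1$, $\lambda_2 \to q_2 \in \Delta_2$ and $\lambda_1 \sqsubseteq \lambda_2$, where $\lambda_1, \lambda_2 \in \Lambda$,
  or
  $f(q_{i_1}, \ldots, q_{i_n}) \to q_1 \in \Delta_1$, $f(q'_{i_1}, \ldots, q'_{i_n}) \to q_2 \in \Delta_2$ and $\forall j \in
  [1, n]$, $(q_{i_j}, q'_{i_j}) \in \mathcal{R}_1$, where $f \in \mathcal{F}_n$

- $q_1 \in Q_{f_1} \Leftrightarrow q_2 \in Q_{f_2}$

Let $q_1 \in Q_1$ and $q_3 \in Q_3$; $(q_1, q_3) \in \mathcal{R}_2$ iff

- $\lambda_1 \to q_1 \in \Delta_1$, $\lambda_3 \to q_3 \in \Delta_3$ and $\lambda_1 \sqsubseteq \lambda_3$, where $\lambda_1, \lambda_3 \in \Lambda$,
  or
  $f(q_{i_1}, \ldots, q_{i_n}) \to q_1 \in \Delta_1$, $f(q'_{i_1}, \ldots, q'_{i_n}) \to q_3 \in \Delta_3$ and $\forall j \in
  [1, n]$, $(q_{i_j}, q'_{i_j}) \in \mathcal{R}_2$, where $f \in \mathcal{F}_n$

- $q_1 \in Q_{f_1} \Leftrightarrow q_3 \in Q_{f_3}$

Let $\mathcal{R} : Q_2 \times Q_3$ be a simulation relation such that $(q_2, q_3) \in \mathcal{R}$ iff
$\exists q_1 \in Q_1 . (q_1, q_2) \in \mathcal{R}_1 \wedge (q_1, q_3) \in \mathcal{R}_2$, where $q_2 \in Q_2$, $q_3 \in Q_3$.

Let $(q_2, q_3) \in \mathcal{R}$. This means that:

- $\lambda_1 \to q_1 \in \Delta_1$, $\lambda_2 \to q_2 \in \Delta_2$, $\lambda_3 \to q_3 \in \Delta_3$ and $\lambda_1 \sqsubseteq \lambda_2$ and $\lambda_1 \sqsubseteq \lambda_3$,
  where $\lambda_1, \lambda_2, \lambda_3 \in \Lambda$ (a),
  or
  $f(q_{i_1}, \ldots, q_{i_n}) \to q_1 \in \Delta_1$, $f(q'_{i_1}, \ldots, q'_{i_n}) \to q_2 \in \Delta_2$, $f(q''_{i_1}, \ldots, q''_{i_n}) \to q_3 \in
  \Delta_3$ and $\forall j \in [1, n]$, $(q_{i_j}, q'_{i_j}) \in \mathcal{R}_1$ and $(q_{i_j}, q''_{i_j}) \in \mathcal{R}_2$, where $f \in \mathcal{F}_n$ (b)

- $q_1 \in Q_{f_1} \Leftrightarrow q_2 \in Q_{f_2}$ and $q_1 \in Q_{f_1} \Leftrightarrow q_3 \in Q_{f_3}$ (c),
  by definition of $\mathcal{R}_1$ and $\mathcal{R}_2$.

(a) Let $\pi \in \Pi$ be the element of the partition such that $\lambda_1 \sqsubseteq \pi$. Then
$Trans(\pi) = \{\lambda \to q \in \Delta_1 \mid \lambda \in \Lambda, \lambda \sqsubseteq \pi\}$, i.e. the set of all the lambda transitions
$\lambda \to q$ in $\Delta_1$ such that $\lambda \sqsubseteq \pi$. Of course $\lambda_1 \to q_1 \in Trans(\pi)$, because $\lambda_1 \sqsubseteq \pi$.
Then $\lambda_2$ is the least upper bound of all $\lambda \in \Lambda$ such that $\lambda \to q \in Trans(\pi)$, i.e.
$\lambda_2 = \bigsqcup \{\lambda \mid \lambda \to q \in Trans(\pi)\}$, according to the determinization algorithm.

As $\mathcal{A}_3$ is deterministic and contains $\mathcal{A}_1$, $\lambda_3$ has to contain at least all
the $\lambda \in \Lambda$ such that $\lambda \to q \in \Delta_1$ and $\lambda \sqsubseteq \pi$, or else $\mathcal{A}_3$ is not deterministic.
So $\lambda_3 \sqsupseteq \bigsqcup \{\lambda \mid \lambda \to q \in Trans(\pi)\}$, so $\lambda_2 \sqsubseteq \lambda_3$.

(b) We can immediately deduce that $\forall j \in [1, n]$, $(q'_{i_j}, q''_{i_j}) \in \mathcal{R}$ by the
definition of $\mathcal{R}$.

(c) So $q_2 \in Q_{f_2} \Leftrightarrow q_3 \in Q_{f_3}$.

Thanks to these properties deduced on $\mathcal{R} : Q_2 \times Q_3$, we can deduce that
$\mathcal{L}(\mathcal{A}_2) \subseteq \mathcal{L}(\mathcal{A}_3)$.

As the least upper bound of two elements of a lattice is the best and
unique upper-approximation, this determinization algorithm returns the best
upper-approximation. □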



3.6 Minimization 

To define the minimization algorithm, we first have to define a recursive Refine
algorithm which refines an equivalence relation $P$ on states, according to the
PLTA $\mathcal{A}$.



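Refine($P$, $\mathcal{A}$)
begin
    Let $P'$ be a new equivalence relation;
    For all $(q, q') \in Q$ such that $q P q'$ do
        if ($\forall f \in \mathcal{F}_n$,
              $\Delta(f(q_1, \ldots, q_{i-1}, q, q_{i+1}, \ldots, q_n)) \; P \; \Delta(f(q_1, \ldots, q_{i-1}, q', q_{i+1}, \ldots, q_n))$,
              where $q_1, \ldots, q_{i-1}, q_{i+1}, \ldots, q_n \in Q$)
           AND ($\forall a \in \mathcal{F}^0$, $a \to q \Leftrightarrow a \to q'$)
           AND ($\forall \lambda_1, \lambda_2 \in \Lambda$, $\exists \pi \in \Pi$
                such that $\lambda_1 \to q \Rightarrow \lambda_2 \to q'$ and $\lambda_1, \lambda_2 \sqsubseteq \pi$)
        THEN $q P' q'$
        ELSE if $P = \{Q_1, \ldots, Q_i, \ldots, Q_n\}$ and $q, q' \in Q_i$
             then $P := \{Q_1, \ldots, Q_{i-1}, Q_{i_1}, Q_{i_2}, Q_{i+1}, \ldots, Q_n\}$;
                  $q \in Q_{i_1}$; $q' \in Q_{i_2}$;
    Refine($P'$);
end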



We are now ready to define the minimization algorithm of a PLTA A. 



MinimizationAlgorithm($\mathcal{A}$)
input: Determinized PLTA $\mathcal{A} = (\Pi, Q, \mathcal{F}, Q_f, \Delta)$
       An equivalence relation $P = \{Q_f, Q \setminus Q_f\}$
output: Minimized and determinized PLTA $\mathcal{A}_m = (\Pi, Q_m, \mathcal{F}, Q_{mf}, \Delta_m)$
begin
    Refine($P$, $\mathcal{A}$);
    Set $Q_m$ to the set of equivalence classes of $P$;
    /* we denote by $[q]$ the equivalence class of state $q$ w.r.t. $P$ */
    For all $\lambda$-transitions, for all $\lambda_1, \lambda_2 \in \Lambda$,
        if $\lambda_1 \to q$, $\lambda_2 \to q' \in \Delta$ and $q P q'$
        then $\lambda_1 \sqcup \lambda_2 \to [q, q'] \in \Delta_m$;
    For all other transitions, $\Delta_m := \{f([q_1], \ldots, [q_n]) \to [q] \mid f(q_1, \ldots, q_n) \to q \in \Delta\}$;
    $Q_{mf} := \{[q] \mid q \in Q_f\}$;
end



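A normalized PLTA is an LTA that is a merged, deterministic and minimized PLTA.

Proposition 3 (Normalized PLTA is the best upper-approximation). Let $\mathcal{A}_1$ be a PLTA and $\mathcal{A}_2$ the PLTA obtained with the minimization algorithm. Then $\mathcal{A}_2$ is a best upper-approximation of $\mathcal{A}_1$ as a normalized PLTA:

1. $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$

2. For any normalized PLTA $\mathcal{A}_3$ based on the same partition as $\mathcal{A}_1$, $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_3) \Rightarrow \mathcal{L}(\mathcal{A}_2) \subseteq \mathcal{L}(\mathcal{A}_3)$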

Proof:

Let $P$ be the equivalence relation at the end of the minimization algorithm.

(1) Base case: for all lambda transitions $\lambda \to q$ of $\mathcal{A}_1$, there is a transition $\lambda' \to [q]$ in $\mathcal{A}_2$ such that $\lambda' = \bigsqcup \{\lambda \mid \lambda \to q' \in \Delta \wedge q' P q\}$.

Induction case: for all non-lambda transitions $f(q_1, \ldots, q_n) \to q$ of $\mathcal{A}_1$, there is the corresponding transition $f([q_1], \ldots, [q_n]) \to [q]$ (where $q \in [q]$, $q_1 \in [q_1], \ldots, q_n \in [q_n]$).
So $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$.

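(2) $\mathcal{A}_1 = (\Pi, Q_1, \mathcal{F}, Q_{f_1}, \Delta_1)$, $\mathcal{A}_2 = (\Pi, Q_2, \mathcal{F}, Q_{f_2}, \Delta_2)$ and $\mathcal{A}_3 = (\Pi, Q_3, \mathcal{F}, Q_{f_3}, \Delta_3)$.

As $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_2)$ (1) and $\mathcal{L}(\mathcal{A}_1) \subseteq \mathcal{L}(\mathcal{A}_3)$, let $\mathcal{R}_1 : Q_1 \times Q_2$ and $\mathcal{R}_2 : Q_1 \times Q_3$ be two simulation relations defining these properties as follows.

Let $q_1 \in Q_1$ and $q_2 \in Q_2$, $(q_1, q_2) \in \mathcal{R}_1$ iff

- $\lambda_1 \to q_1 \in \Delta_1$, $\lambda_2 \to q_2 \in \Delta_2$ and $\lambda_1 \sqsubseteq \lambda_2$, where $\lambda_1, \lambda_2 \in \Lambda$,
  or
  $f(q_{i_1}, \ldots, q_{i_n}) \to q_1 \in \Delta_1$, $f(q'_{i_1}, \ldots, q'_{i_n}) \to q_2 \in \Delta_2$ and $\forall j \in [1,n]$, $(q_{i_j}, q'_{i_j}) \in \mathcal{R}_1$, where $f \in \mathcal{F}_n$

- $q_1 \in Q_{f_1} \iff q_2 \in Q_{f_2}$

Let $q_1 \in Q_1$ and $q_3 \in Q_3$, $(q_1, q_3) \in \mathcal{R}_2$ iff

- $\lambda_1 \to q_1 \in \Delta_1$, $\lambda_3 \to q_3 \in \Delta_3$ and $\lambda_1 \sqsubseteq \lambda_3$, where $\lambda_1, \lambda_3 \in \Lambda$,
  or
  $f(q_{i_1}, \ldots, q_{i_n}) \to q_1 \in \Delta_1$, $f(q''_{i_1}, \ldots, q''_{i_n}) \to q_3 \in \Delta_3$ and $\forall j \in [1,n]$, $(q_{i_j}, q''_{i_j}) \in \mathcal{R}_2$, where $f \in \mathcal{F}_n$

- $q_1 \in Q_{f_1} \iff q_3 \in Q_{f_3}$

Let $\mathcal{R} : Q_2 \times Q_3$ be a simulation relation such that $(q_2, q_3) \in \mathcal{R}$ iff $\exists q_1 \in Q_1 . (q_1, q_2) \in \mathcal{R}_1 \wedge (q_1, q_3) \in \mathcal{R}_2$, where $q_2 \in Q_2$, $q_3 \in Q_3$.

Let $(q_2, q_3) \in \mathcal{R}$. This means that:

- $\lambda_1 \to q_1 \in \Delta_1$, $\lambda_2 \to q_2 \in \Delta_2$, $\lambda_3 \to q_3 \in \Delta_3$ and $\lambda_1 \sqsubseteq \lambda_2$ and $\lambda_1 \sqsubseteq \lambda_3$, where $\lambda_1, \lambda_2, \lambda_3 \in \Lambda$ (a),
  or
  $f(q_{i_1}, \ldots, q_{i_n}) \to q_1 \in \Delta_1$, $f(q'_{i_1}, \ldots, q'_{i_n}) \to q_2 \in \Delta_2$, $f(q''_{i_1}, \ldots, q''_{i_n}) \to q_3 \in \Delta_3$ and $\forall j \in [1,n]$, $(q_{i_j}, q'_{i_j}) \in \mathcal{R}_1$ and $(q_{i_j}, q''_{i_j}) \in \mathcal{R}_2$, where $f \in \mathcal{F}_n$ (b)

- $q_1 \in Q_{f_1} \iff q_2 \in Q_{f_2}$ and $q_1 \in Q_{f_1} \iff q_3 \in Q_{f_3}$ (c),
  by definition of $\mathcal{R}_1$ and $\mathcal{R}_2$.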



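(a) We have $\lambda_1 \to q_1 \in \Delta_1$, $\lambda_2 \to q_2 \in \Delta_2$ and $\lambda_1 \sqsubseteq \lambda_2$. According to the minimization algorithm, $\lambda_2$ is the least upper bound of all $\lambda \in \Lambda$ such that there exists $q \in Q_1$ such that $\lambda \to q \in \Delta_1$ and $q$ is in the same equivalence class as $q_1$ (i.e., $q \in [q_1]$ or $q P q_1$). Formally, $\lambda_2 = \bigsqcup \{\lambda \mid \lambda \to q \in \Delta_1 \wedge q P q_1\}$.

As $\mathcal{A}_3$ is minimized and contains $\mathcal{A}_1$, $\mathcal{A}_3$ has to contain at least all the $\lambda \in \Lambda$ such that $\lambda \to q \in \Delta_1$ and $q P q_1$, or else $\mathcal{A}_3$ is not minimized.

So $\lambda_3 \sqsupseteq \bigsqcup \{\lambda \mid \lambda \to q \in \Delta_1 \wedge q P q_1\}$, so $\lambda_2 \sqsubseteq \lambda_3$.

(b) We can immediately deduce that $\forall j \in [1,n]$, $(q'_{i_j}, q''_{i_j}) \in \mathcal{R}$ by the definition of $\mathcal{R}$.

(c) So $q_2 \in Q_{f_2} \iff q_3 \in Q_{f_3}$.

And thanks to these properties deduced on $\mathcal{R} : Q_2 \times Q_3$, we can deduce that $\mathcal{L}(\mathcal{A}_2) \subseteq \mathcal{L}(\mathcal{A}_3)$.

As the least upper bound of two elements of a lattice is the best and unique upper-approximation, this minimization algorithm returns the best upper-approximation. □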



3.7 Refinement of the partition 

In the previous paragraphs, the partition $\Pi$ was fixed. The precision of the upper-approximations made during the determinization algorithm depends on the fineness of $\Pi$. For example, if $\Pi$ is of size 1, all $\lambda$-transitions will be merged into one.

Definition 6 (Refinement of a partition).

A partition $\Pi_2$ refines a partition $\Pi_1$ if:

$\forall \pi_2 \in \Pi_2, \exists \pi_1 \in \Pi_1 : \pi_2 \sqsubseteq \pi_1$

Let $\mathcal{A}_1 = (\Pi_1, Q, \mathcal{F}, Q_f, \Delta_1)$ be a PLTA. The PLTA $\mathcal{A}_2 = (\Pi_2, Q, \mathcal{F}, Q_f, \Delta_2)$ refines $\mathcal{A}_1$ if:

1. $\Pi_2$ refines $\Pi_1$

2. the transitions of $\mathcal{A}_2$ are obtained by: $\forall \lambda \to q \in \Delta_1$, $\forall \pi_2 \in \Pi_2$, $\lambda \sqcap \pi_2 \to q \in \Delta_2$

Refining an automaton does not immediately modify the recognized language, but leads to a more precise upper-approximation in the determinization, as illustrated hereafter.

Example 7. Given $\Pi$ and $\mathcal{A}$ of Example 5 and a partition $\Pi_2 = \{]-\infty,-1[,\ [-1,0[,\ [0,0],\ ]0,+\infty[\}$ that refines $\Pi$, the set of transitions $\Delta_2$ of the PLTA obtained with $\Pi_2$ is $\Delta_2 = \{[-3,-1[ \to q_1,\ [-1,-1] \to q_1,\ [-5,-2] \to q_2,\ [3,4] \to q_3,\ [-3,-1[ \to q_4,\ [-1,0[ \to q_4,\ [0,0] \to q_4,\ ]0,2] \to q_4,\ f(q_1,q_2) \to q_5,\ f(q_3,q_4) \to q_6,\ f(q_5,q_6) \to q_{f_1},\ f(q_5,q_6) \to q_{f_2}\}$.

We now obtain this set of transitions for the corresponding deterministic PLTA with $\Pi_2$: $\Delta_{2d} = \{[-5,-1[ \to q_{1,2,4},\ [-1,0[ \to q_{1,4},\ ]0,4] \to q_{3,4},\ [0,0] \to q_4,\ f(q_{1,2,4},q_{1,2,4}) \to q_5,\ f(q_{1,4},q_{1,2,4}) \to q_5,\ f(q_{3,4},q_{3,4}) \to q_6,\ f(q_{3,4},q_4) \to q_6,\ f(q_{3,4},q_{1,2,4}) \to q_6,\ f(q_{3,4},q_{1,4}) \to q_6,\ f(q_5,q_6) \to q_{f_1,f_2}\}$.
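The refinement step of Definition 6 can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: intervals are closed intervals over the integers (so the half-open blocks of Example 7 become `(-inf, -2)`, `(-1, -1)`, `(0, 0)` and `(1, +inf)`), the input λ-transitions are the ones that appear to underlie Example 7, and the names `meet` and `refine_lambda_transitions` are hypothetical helpers, not part of any tool mentioned in the paper.

```python
import math

def meet(a, b):
    """Greatest lower bound of two closed integer intervals, or None if empty."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def refine_lambda_transitions(transitions, partition):
    """Split every lambda-transition (interval, state) along the partition blocks."""
    refined = set()
    for interval, state in transitions:
        for block in partition:
            part = meet(interval, block)
            if part is not None:          # empty intersections are dropped
                refined.add((part, state))
    return refined

# Partition of Example 7, read over the integers (assumption).
PI2 = [(-math.inf, -2), (-1, -1), (0, 0), (1, math.inf)]
# Lambda-transitions assumed for the automaton of Example 5.
delta = {((-3, -1), "q1"), ((-5, -2), "q2"), ((3, 4), "q3"), ((-3, 2), "q4")}
delta2 = refine_lambda_transitions(delta, PI2)
```

With these inputs, the transition `[-3, 2] -> q4` is split into four pieces, one per block of the partition, which mirrors the four λ-transitions to q4 listed in Example 7.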



4 A Completion Algorithm for LTA 

We are interested in computing the set of reachable states of an infinite-state system. In general this set is neither representable nor computable. In this paper, we suggest to work within the Tree Regular Model Checking framework for representing possibly infinite sets of states. More precisely, we propose to represent configurations by (built-in) terms and sets of configurations (or sets of states) by an LTA.

In addition, we assume that the behavior of the system can be represented by conditional term rewriting systems (TRS), that is, term rewriting systems equipped with conjunctions of conditions used to restrain the applicability of the rules. Our conditional TRS, which extends the classical definition of [?], rewrites terms defined on the concrete domain. This makes them independent from the abstract lattice. We first start with the definition of predicates that allow us to express conditions on TRS.

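Definition 7 (Predicates). Let $\mathcal{P}$ be the set of predicates over $\mathcal{D}$. For instance, if $p$ is an $n$-ary predicate of $\mathcal{P}$ then $p : \mathcal{D}^n \mapsto \{true, false\}$. We extend the domain of $p$ to $\mathcal{T}(\mathcal{F}_\bullet, \mathcal{X})^n$ in the following way:

$$p(t_1, \ldots, t_n) = \begin{cases} p(u_1, \ldots, u_n) & \text{if } \forall i = 1 \ldots n : t_i \in \mathcal{T}(\mathcal{F}_\bullet), \text{ where } \forall i = 1 \ldots n : u_i = eval(t_i) \\ false & \text{if } \exists j = 1 \ldots n : t_j \notin \mathcal{T}(\mathcal{F}_\bullet) \end{cases}$$

Observe that predicates are defined on built-in terms of the concrete domain. If one of the predicate parameters cannot be evaluated into a built-in term, then the predicate returns false and the rule is not applied.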

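Definition 8 (Conditional Term Rewriting System on $\mathcal{T}(\mathcal{F}_\circ \cup \mathcal{F}_\bullet, \mathcal{X})$). In our setting, a Term Rewriting System (TRS) $\mathcal{R}$ is a set of rewrite rules $l \to r \Leftarrow c_1 \wedge \ldots \wedge c_n$, where $l \in \mathcal{T}(\mathcal{F}_\circ, \mathcal{X})$, $r \in \mathcal{T}(\mathcal{F}, \mathcal{X}) = \mathcal{T}(\mathcal{F}_\circ \cup \mathcal{F}_\bullet, \mathcal{X})$, $l \notin \mathcal{X}$, $Var(l) \supseteq Var(r)$ and $\forall i = 1 \ldots n : c_i = p_i(t_1, \ldots, t_m)$ where $p_i$ is an $m$-ary predicate of $\mathcal{P}$ and $\forall j = 1 \ldots m : t_j \in \mathcal{T}(\mathcal{F}_\bullet, \mathcal{X}) \wedge Var(t_j) \subseteq Var(l)$.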

Example 8. Using conditional rewriting rules, the factorial can be encoded as follows:

$fact(x) \to 1 \Leftarrow x \geq 0 \wedge x \leq 1$

$fact(x) \to x * fact(x - 1) \Leftarrow x \geq 2$
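To make the role of the guards concrete, the following Python sketch applies the two rules of Example 8 to a concrete integer. It is a simplification made for illustration only: the guards are read as x ≥ 0 ∧ x ≤ 1 and x ≥ 2, the built-in operations `*` and `-` are evaluated immediately (playing the role of eval), and the function name `rewrite_fact` is hypothetical.

```python
def rewrite_fact(x):
    """Rewrite fact(x) with the two conditional rules; return (value, rule applications)."""
    steps = 0
    acc = 1                      # product of the factors already peeled off
    while True:
        steps += 1
        if 0 <= x <= 1:          # fact(x) -> 1  <=  x >= 0 and x <= 1
            return acc, steps
        if x >= 2:               # fact(x) -> x * fact(x - 1)  <=  x >= 2
            acc, x = acc * x, x - 1
            continue
        # No guard holds (negative input): the term is irreducible.
        raise ValueError("no rule applies: every guard is false")
```

For instance, `rewrite_fact(5)` applies the second rule four times and the first rule once, while a negative argument satisfies neither guard and is therefore not rewritten.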



The TRS $\mathcal{R}$ and the $eval$ function induce a rewriting relation $\to_\mathcal{R}$ on $\mathcal{T}(\mathcal{F})$ in the following way: for all $s, t \in \mathcal{T}(\mathcal{F})$, we have $s \to_\mathcal{R} t$ if there exist (1) a rewrite rule $l \to r \Leftarrow c_1 \wedge \ldots \wedge c_n \in \mathcal{R}$, (2) a position $p \in Pos(s)$, (3) a substitution $\sigma : \mathcal{X} \mapsto \mathcal{T}(\mathcal{F})$ such that $s|_p = l\sigma$, $t = eval(s[r\sigma]_p)$ and $\forall i = 1 \ldots n : c_i\sigma = true$. The reflexive transitive closure of $\to_\mathcal{R}$ is denoted by $\to^*_\mathcal{R}$.

Our objective is to compute an LTA representing the set (or an over-approximation of the set) of reachable states of an LTA $\mathcal{A}$ with respect to a TRS $\mathcal{R}$. In this paper, we adopt the completion approach of [?, 17], which intends to compute a tree automaton $\mathcal{A}^*_\mathcal{R}$ such that $\mathcal{L}(\mathcal{A}^*_\mathcal{R}) \supseteq \mathcal{R}^*(\mathcal{L}(\mathcal{A}))$. The algorithm proceeds by computing the sequence of automata $\mathcal{A}^0_\mathcal{R}, \mathcal{A}^1_\mathcal{R}, \mathcal{A}^2_\mathcal{R}, \ldots$ that represents successive applications of $\mathcal{R}$. Computing $\mathcal{A}^{i+1}_\mathcal{R}$ from $\mathcal{A}^i_\mathcal{R}$ is called a one-step completion. In general the sequence of automata may not converge in a finite amount of time. To force convergence, we perform an abstraction operation. Our abstraction relies on merging states that are considered to be equivalent with respect to a certain equivalence relation defined by a set of equations. We now give details on the above constructions. Then, we show that, in order to be correct, our procedure has to be combined with an evaluation that may add new terms to the language of the automaton obtained by completion or equational abstraction. We shall see that this closure property may add an infinite number of transitions whose behavior is captured with a new widening operator for LTA.

4.1 Computation of $\mathcal{A}^{i+1}_\mathcal{R}$

In our setting, $\mathcal{A}^{i+1}_\mathcal{R}$ is built from $\mathcal{A}^i_\mathcal{R}$ by using a completion step that relies on finding critical pairs. Given a substitution $\sigma : \mathcal{X} \mapsto Q$ and a rule $l \to r \Leftarrow c_1 \wedge \ldots \wedge c_n \in \mathcal{R}$, a critical pair is a pair $(r\sigma', q)$ where $q \in Q$ and $\sigma'$ is the greatest substitution w.r.t. $\sqsubseteq$ such that $l\sigma \to^*_{\mathcal{A}^i_\mathcal{R}} q$, $\sigma \sqsupseteq \sigma'$ and $c_1\sigma' \wedge \ldots \wedge c_n\sigma'$.

Observe that since $\mathcal{R}$, $\mathcal{A}^i_\mathcal{R}$ and $Q$ are finite, there is only a finite number of such critical pairs. For each critical pair such that $r\sigma' \not\to^*_{\mathcal{A}^i_\mathcal{R}} q$, the algorithm adds two new transitions $r\sigma' \to q'$ and $q' \to q$ to $\mathcal{A}^{i+1}_\mathcal{R}$.

Building critical pairs for a rewriting rule $l \to r$ requires detecting all substitutions $\sigma$ such that $l\sigma \to^* q$, where $q$ is a state of the automaton. In what follows, we use the standard matching algorithm introduced in [17]. This algorithm $Matching(l, \mathcal{A}, q)$, which is described hereafter, matches a linear term $l$ with a state $q$ in the automaton $\mathcal{A}$. The solution returned by Matching is a disjunction of possible substitutions $\sigma_1 \vee \ldots \vee \sigma_n$ such that $l\sigma_i \to^*_\mathcal{A} q$.

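Let us recall the standard matching algorithm:

$$\frac{f(s_1, \ldots, s_n) \unlhd f(q_1, \ldots, q_n)}{s_1 \unlhd q_1 \wedge \cdots \wedge s_n \unlhd q_n} \qquad \frac{f(s_1, \ldots, s_n) \unlhd g(q'_1, \ldots, q'_m)}{\bot}$$

$$\text{(Config)} \ \frac{s \unlhd q}{s \unlhd u_1 \vee \cdots \vee s \unlhd u_k \vee \bot}, \ \forall u_i \text{ s.t. } u_i \to q \in \Delta, \text{ if } s \notin \mathcal{X}.$$

Moreover, after each application of one of these rules, the result is also rewritten into disjunctive normal form, using:

$$\frac{\phi_1 \wedge (\phi_2 \vee \phi_3)}{(\phi_1 \wedge \phi_2) \vee (\phi_1 \wedge \phi_3)} \qquad \frac{\phi_1 \vee \bot}{\phi_1} \qquad \frac{\phi_1 \wedge \bot}{\bot}$$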

However, as our TRS relies on conditions, we have to extend this matching algorithm in order to guarantee that each substitution $\sigma_i$ that is a solution of $l \to r \Leftarrow c_1 \wedge \ldots \wedge c_n$ satisfies $c_1 \wedge \ldots \wedge c_n$. For example, given the rule $f(x) \to f(g(x)) \Leftarrow x \geq 3 \wedge x \leq 7$ and the transitions $[2,8] \to q_1$, $f(q_1) \to q_2$, the set of substitutions returned by the matching algorithm is $\{x \mapsto [2,8]\}$, which is restricted to $[3,7]$.

Restricting substitutions is done by a solver on abstract domains. Such a solver takes as input the lambda transitions of the automaton and all conditions of the rules, and outputs a set of substitutions of the form $\sigma' = \{x \mapsto \lambda_x, y \mapsto \lambda_y\}$. Such solvers exist for various abstract domains (see [?] for illustrations). In the present context, our solver has to satisfy the following property:

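Property 1 (Correction of the solver). Let $\sigma = \{x_1 \mapsto q_1, \ldots, x_k \mapsto q_k\}$ be a substitution and $c = c_1 \wedge \cdots \wedge c_n$ a conjunction of constraints. We consider $\sigma/c = \{x_i \mapsto q_i \mid \exists 1 \leq j \leq n, x_i \in Var(c_j)\}$ the restriction of the substitution to the constrained variables. We also define $S_c = \{i \mid \exists 1 \leq j \leq n, x_i \in Var(c_j)\}$.

For any tuple $\langle \lambda_i \mid i \in S_c \rangle$ such that $\lambda_i \to_\mathcal{A} q_i$, $Solve_\Lambda(\sigma/c, \langle \lambda_i \mid i \in S_c \rangle, c)$ is a substitution $\sigma'$ such that (1) if $i \notin S_c$, $\sigma'(x_i) = q_i$, and (2) if $i \in S_c$, $\sigma'(x_i) = \lambda'_i$. In addition, if a tuple of abstract values $\langle \lambda''_i \mid i \in S_c \rangle$ satisfies (a) $\forall i \in S_c$, $\lambda''_i \sqsubseteq \lambda_i$, and (b) $\forall 1 \leq j \leq n$, the substitution $\sigma''/c = \{x_i \mapsto \lambda''_i\}$ satisfies $c_j$, then $\forall i \in S_c$, $\lambda''_i \sqsubseteq \lambda'_i$.

Using Prop. 1, the global function $Solve(\sigma, \mathcal{A}, c_1 \wedge \cdots \wedge c_n)$ is defined as:

$$Solve(\sigma, \mathcal{A}, c_1 \wedge \cdots \wedge c_n) = \bigcup_{\langle \lambda_i \mid i \in S_c \rangle,\ \lambda_i \to_\mathcal{A} q_i} Solve_\Lambda(\sigma/c, \langle \lambda_i \mid i \in S_c \rangle, c)$$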

The following theorem states that the solver is sound.

Theorem 1. $Solve(\sigma, \mathcal{A}, c_1 \wedge \cdots \wedge c_n)$ is an over-approximation of the solutions of the constraints.

Proof. By Prop. 1, for any tuple $\langle \lambda_i \mid i \in S_c \rangle$ such that $\lambda_i \to_\mathcal{A} q_i$, $Solve_\Lambda(\sigma/c, \langle \lambda_i \mid i \in S_c \rangle, c)$ is a substitution $\sigma'$ such that if $i \in S_c$, $\sigma'(x_i) = \lambda'_i$. Let $\langle \lambda''_i \mid i \in S_c \rangle$ be a tuple such that $\forall 1 \leq j \leq n$, the substitution $\sigma''/c = \{x_i \mapsto \lambda''_i\}$ satisfies $c_j$. Thanks to Prop. 1, we have that $\forall i \in S_c$, $\lambda''_i \sqsubseteq \lambda'_i$. Since for all $i \in S_c$, $\lambda'_i$ is returned by the solver, we can deduce that the set of substitutions returned by the solver is an over-approximation of the solutions of the constraints.



Depending on the abstract domain, defining a solver that satisfies the above property may be complex. However, we shall now see that an easy and elegant solution can already be obtained for intervals of integers. As we shall see in Section 6, such lattices act as a powerful tool to simplify the analysis of Java programs. Observe that the algorithm for computing $Solve_\Lambda(\sigma/c, \langle \lambda_i \mid i \in S_c \rangle, c)$ depends on the lattice $\Lambda$ and on the type of constraints of $c$. If $c$ is a conjunction of linear constraints and $\Lambda$ the lattice of intervals, the algorithm computing $Solve_\Lambda(\sigma, \langle \lambda_1, \ldots, \lambda_k \rangle, c_1 \wedge \cdots \wedge c_n)$ is:

1. $P_1$ is the convex polyhedron defined by the constraints $c_1 \wedge \cdots \wedge c_n$,

2. $P_2$ is the box defined by the constraints $x_1 \in \lambda_1, \ldots, x_k \in \lambda_k$,

3. if $P_1 \sqcap P_2 \neq \emptyset$, then we project $P_1 \sqcap P_2$ on each dimension (i.e. on each variable $x_k$) to obtain $k$ new intervals. Otherwise, $Solve_\Lambda(\sigma, \langle \lambda_1, \ldots, \lambda_k \rangle, c_1 \wedge \cdots \wedge c_n) = \emptyset$.
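The three steps above can be sketched in Python for a deliberately restricted case. The sketch assumes that every constraint is a bound on a single integer variable (`x < c`, `x <= c`, `x > c`, `x >= c`), so that the polyhedron $P_1$ is itself a box and the meet with $P_2$ followed by projection reduces to tightening each interval. General linear constraints would need a polyhedra library, which is not shown. The name `solve_intervals` and the constraint encoding are hypothetical.

```python
def solve_intervals(box, constraints):
    """Restrict a box {var: (lo, hi)} by bound constraints (var, op, const).

    Returns the restricted box, or None when the meet of P1 and P2 is empty.
    """
    result = dict(box)
    for var, op, c in constraints:
        lo, hi = result[var]
        if op == "<":
            hi = min(hi, c - 1)      # over the integers, x < c iff x <= c - 1
        elif op == "<=":
            hi = min(hi, c)
        elif op == ">":
            lo = max(lo, c + 1)      # over the integers, x > c iff x >= c + 1
        elif op == ">=":
            lo = max(lo, c)
        else:
            raise ValueError(f"unsupported operator {op!r}")
        if lo > hi:                  # empty intersection: no solution
            return None
        result[var] = (lo, hi)
    return result
```

For example, restricting `x` in `[2, 8]` by `x >= 3` and `x <= 7` yields `[3, 7]`, and restricting `x` in `[2, 3]` by `x < 3` yields `[2, 2]`.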

Definition 9 (Matching solutions of conditional rewrite rules). Let $\mathcal{A}$ be a tree automaton, $rl = l \to r \Leftarrow c_1 \wedge \ldots \wedge c_n$ a rewrite rule and $q$ a state of $\mathcal{A}$. The set of all possible substitutions for the rewrite rule $rl$ is $\Omega(\mathcal{A}, rl, q) = \{\sigma' \mid \sigma \in Matching(l, \mathcal{A}, q) \wedge \sigma' \in Solve(\sigma, \mathcal{A}, c_1 \wedge \ldots \wedge c_n) \wedge \nexists \sigma'' : r\sigma' \sqsubseteq r\sigma'' \to^*_\mathcal{A} q\}$.

Once the set of all possible restricted substitutions $\sigma_i$ has been obtained, we have to add the rules $r\sigma_i \to^* q$ to the automaton. However, the transition $r\sigma_i \to q$ is not necessarily a normalized transition of the form $f(q_1, \ldots, q_n) \to q$, which means that it has to be normalized first. Normalization is defined by the following algorithm.

Definition 10 (Normalization). Let $s \in \mathcal{T}(\mathcal{F} \cup Q)$, $q \in Q$, $\mathcal{F}_\bullet$ the set of concrete interpretable symbols used in the TRS, and $\mathcal{A} = (\mathcal{F}, Q, Q_f, \Delta)$ an LTA, where $\mathcal{F} = \mathcal{F}^\#_\bullet \cup \mathcal{F}_\circ$, and $\alpha : \mathcal{F}_\bullet \to \mathcal{F}^\#_\bullet$ the abstraction function. A new state is a state of $Q$ not occurring in $\Delta$. $Norm(s \to^* q)$ returns the set of normalized transitions deduced from $s$. $Norm(s \to^* q)$ is inductively defined by:

1. if $s \in \mathcal{F}^0_\bullet$ (i.e., in the concrete domain used in rewrite rules), $Norm(s \to^* q) = \{\alpha(s) \to q\}$,

2. if $s \in \mathcal{F}^{\#0}_\bullet \cup \mathcal{F}^0_\circ$ then $Norm(s \to^* q) = \{s \to q\}$,

3. if $s = f(t_1, \ldots, t_n)$ where $f \in \mathcal{F}^n_\bullet \cup \mathcal{F}^n_\circ$, then $Norm(s \to^* q) = \{f(q'_1, \ldots, q'_n) \to q\} \cup Norm(t_1 \to^* q'_1) \cup \ldots \cup Norm(t_n \to^* q'_n)$ where for $i = 1 \ldots n$, $q'_i$ is either:

   - the right-hand side of a transition of $\mathcal{A}$ such that $t_i \to^*_\mathcal{A} q'_i$,

   - or a new state, otherwise.

Observe that the normalization algorithm always terminates. We conclude with the formal characterization of the one-step completion.

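Definition 11 (One step completed automaton $\mathcal{C}_\mathcal{R}(\mathcal{A})$). Let $\mathcal{A} = (\mathcal{F}, Q, Q_f, \Delta)$ be a tree automaton and $\mathcal{R}$ be a left-linear TRS. We denote by $\mathcal{C}_\mathcal{R}(\mathcal{A})$ the one step completed automaton $\mathcal{C}_\mathcal{R}(\mathcal{A}) = (\mathcal{F}, Q', Q_f, \Delta')$ where:

$$\Delta' = \Delta \cup \bigcup_{l \to r \in \mathcal{R},\ q \in Q,\ \sigma \in \Omega(\mathcal{A}, l \to r, q)} Norm(r\sigma \to^* q') \cup \{q' \to q\}$$

where $\Omega(\mathcal{A}, l \to r, q)$ is the set of all possible substitutions defined in Def. 9, $q' \notin Q$ is a new state and $Q'$ contains all the states of $\Delta'$.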

4.2 Equational Abstraction 

As we already said, completion may not terminate. In order to enforce termination of the process, we suggest to merge states according to a set of approximation equations $E$. An approximation equation is of the form $u = v$, where $u, v \in \mathcal{T}(\mathcal{F}_\circ, \mathcal{X})$. Let $\sigma : \mathcal{X} \mapsto Q$ be a substitution such that $u\sigma \to^*_{\mathcal{A}^{i+1}_\mathcal{R}} q$, $v\sigma \to^*_{\mathcal{A}^{i+1}_\mathcal{R}} q'$ and $q \neq q'$, then we know that some terms recognized by $q$ and $q'$ are equivalent modulo $E$. An over-approximation of $\mathcal{A}^{i+1}_\mathcal{R}$, which we denote $\mathcal{A}^{i+1}_{\mathcal{R},E}$, can be obtained by merging states $q$ and $q'$.

Definition 12 (merge). Let $\mathcal{A} = (\mathcal{F}, Q, Q_f, \Delta)$ be an LTA and $q_1, q_2$ be two states of $\mathcal{A}$. We denote by $merge(\mathcal{A}, q_1, q_2)$ the tree automaton where each occurrence of $q_2$ is replaced by $q_1$.
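The merge operation is a plain renaming of states, which the following Python sketch illustrates. The data representation is an assumption made for the example: a transition is a triple `(symbol, argument_states, target_state)`, and a λ-transition uses the interval as symbol with an empty argument tuple. The function name `merge` follows Definition 12, but the signature is hypothetical.

```python
def merge(transitions, final_states, q1, q2):
    """Replace every occurrence of state q2 by q1 in transitions and final states."""
    def rename(q):
        return q1 if q == q2 else q

    new_transitions = {
        (symbol, tuple(rename(a) for a in args), rename(target))
        for symbol, args, target in transitions
    }
    return new_transitions, {rename(q) for q in final_states}

# A fragment with an addition, two lambda-transitions and one f-transition.
delta = {
    ("+", ("q8", "q[2,2]"), "q10"),
    ((5, 5), (), "q8"),
    ((7, 7), (), "q10"),
    ("f", ("q10",), "q9"),
}
merged, finals = merge(delta, {"q9"}, "q8", "q10")
```

After merging, the addition becomes the self-loop `q8 + q[2,2] -> q8`, which is exactly the shape of transition whose evaluation may diverge and therefore calls for widening.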

Equations on interpretable terms. In what follows, we need to extend approximation equations to built-in terms. Indeed, as illustrated in the following example, approximation equations defined on $\mathcal{T}(\mathcal{F}_\circ, \mathcal{X})$ are not powerful enough to ensure termination.

Example 9. Let $f(x) \to f(x + 1)$ be a rewrite rule and $\{[1,1] \to q_1, [2,2] \to q_2, f(q_2) \to q_f\}$ be transitions of an LTA, then successive completion and normalization steps will add transitions $q_2 + q_1 \to q_3$, $q_3 + q_1 \to q_4$, $q_4 + q_1 \to q_5$, $\ldots$, $q_i + q_1 \to q_{i+1}$, $\ldots$ Unfortunately, as classical equations do not work on terms with interpretable symbols, this infinite behaviour cannot be captured.

We define a new type of equation which works on interpretable terms and is applied under conditions. Such equations have the form $u = v \Leftarrow c_1 \wedge \ldots \wedge c_n$, where $u, v \in \mathcal{T}(\mathcal{F}_\circ \cup \mathcal{F}_\bullet, \mathcal{X})$. We observe that we can use almost the same matching algorithm as for completion. The first main difference is that we need to match a term $t \in \mathcal{T}(\mathcal{F}_\circ \cup \mathcal{F}_\bullet, \mathcal{X})$ built on interpreted symbols against terms of $\mathcal{T}(\mathcal{F}_\circ \cup \mathcal{F}^\#_\bullet, \mathcal{X})$ recognized by the LTA $\mathcal{A}$. The solution is to use the same matching algorithm on $\alpha(t)$ and $\mathcal{A}$, i.e. $Matching(\alpha(t), \mathcal{A}, q)$. Contrary to the completion case, we do not need to restrict the substitutions obtained by the matching algorithm with respect to the constraints of the equation, but simply guarantee that such constraints are satisfiable, i.e., $Solve(\sigma, \mathcal{A}, c_1 \wedge \cdots \wedge c_n) \neq \emptyset$.

Example 10. Equation $x = x + 1 \Leftarrow x > 3$ can be used to merge states $q_4$ and $q_5$ in Ex. 9.

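Theorem 2. Let $\mathcal{A}$ be an LTA and $E$ a set of equations. We denote by $\leadsto_E$ the transformation of $\mathcal{A}$ by merging equivalent states according to $E$. The language of the resulting automaton $\mathcal{A}'$ such that $\mathcal{A} \leadsto_E \mathcal{A}'$ is an over-approximation of the language of $\mathcal{A}$, i.e., $\mathcal{L}(\mathcal{A}) \subseteq \mathcal{L}(\mathcal{A}')$.

Proof. Let $\mathcal{A}$ and $\mathcal{A}'$ be two automata and $E$ be a set of equations such that $\mathcal{A} \leadsto_E \mathcal{A}'$. The set of transitions of $\mathcal{A}'$ is the same as that of $\mathcal{A}$ with states merged according to the equivalence classes determined by $E$. For all $t \in \mathcal{T}(\mathcal{F}, \mathcal{X})$, for all states $q$ of $\mathcal{A}$, let $Q = \{q_1, \ldots, q, \ldots, q_n\}$ be an equivalence class determined by $E$. We have that $t \in \mathcal{L}(\mathcal{A}, q) \Rightarrow t \to^*_\mathcal{A} q \Rightarrow t \to^*_{\mathcal{A}'} Q \Rightarrow t \in \mathcal{L}(\mathcal{A}', Q)$.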

4.3 Evaluation and Correctness 

In this section, we formally define completion on LTA and establish its correctness. We first start with the evaluation of an LTA.

Evaluation of a Lattice Tree Automaton. We observe that any set of concrete terms that contains the term 1 + 2 should also contain the term 3. While this canonical property can be naturally assumed when building the initial set of states, it may eventually be broken when performing a completion step or by merging states. Indeed, let $f(x) \to f(x + 1)$ be a rewrite rule and $\sigma : x \mapsto q_2$ a substitution; a completion step applied on $\{[1,1] \to q_1, [2,3] \to q_2, f(q_2) \to q_f\}$ will add the transitions $f(q_3) \to q_4$, $q_2 + q_1 \to q_3$, and $q_4 \to q_f$. Since the language recognized by $q_3$ contains the term $q_2 + q_1$, it should also contain the term $[3,4]$. Evaluation of this set of transitions will add the transition $[3,4] \to q_3$. This is done by applying the propag function.

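Definition 13 (propag).

$$propag(\Delta) = \begin{cases} \Delta & \text{if } \exists \lambda \to q \in \Delta \wedge eval(f(\lambda_1, \ldots, \lambda_k)) \sqsubseteq \lambda \\ \Delta \cup \{eval(f(\lambda_1, \ldots, \lambda_k)) \to q\} & \text{otherwise} \end{cases}$$

$\forall f \in \mathcal{F}^k_\bullet : \forall q, q_1, \ldots, q_k \in Q : \forall \lambda_1, \ldots, \lambda_k \in \Lambda : f(q_1, \ldots, q_k) \to q \in \Delta \wedge \{\lambda_1 \to q_1, \ldots, \lambda_k \to q_k\} \subseteq \Delta$

Using propag, we can extend the $eval$ function to sets of transitions and to tree automata in the following way.

Definition 14 ($eval$ on transitions and automata).

Let $\mu X$ be the least fix-point obtained by iterating propag.

- $eval(\Delta) = \mu X . propag(X) \cup \Delta$ and

- $eval((\mathcal{F}, Q, Q_f, \Delta)) = (\mathcal{F}, Q, Q_f, eval(\Delta))$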

Example 11. Let $\Delta = \{[3,6] \to q_1, [2,8] \to q_2, q_1 + q_2 \to q_3, f(q_3) \to q_f\}$, then propag will evaluate the term $[3,6] + [2,8]$ contained in the transition $q_1 + q_2 \to q_3$, and add the transition $[5,14] \to q_3$ to the automaton.
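A single propagation round over interval addition can be sketched as follows. The sketch is restricted to binary `+` on closed integer intervals, and the representation (a set of `(interval, state)` pairs for λ-transitions, and a set of `(left_state, right_state, target_state)` triples for addition transitions) is an assumption made for the example. The function names are hypothetical.

```python
def add(a, b):
    """Interval addition: [a0, a1] + [b0, b1] = [a0 + b0, a1 + b1]."""
    return (a[0] + b[0], a[1] + b[1])

def included(a, b):
    """True when interval a is included in interval b."""
    return b[0] <= a[0] and a[1] <= b[1]

def propag(lambdas, sums):
    """One propagation round: evaluate every q_l + q_r -> q on known intervals."""
    result = set(lambdas)
    for q_left, q_right, q in sums:
        for la, sa in lambdas:
            for lb, sb in lambdas:
                if sa == q_left and sb == q_right:
                    value = add(la, lb)
                    # Add the value unless an existing transition to q covers it.
                    if not any(s == q and included(value, l) for l, s in result):
                        result.add((value, q))
    return result

lambdas = {((3, 6), "q1"), ((2, 8), "q2")}
once = propag(lambdas, {("q1", "q2", "q3")})
```

On the transitions of Example 11, one round adds `[5, 14] -> q3`, and a second round changes nothing, so the fix-point is reached immediately in this case.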

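Theorem 3. $\mathcal{L}(\mathcal{A}) \subseteq \mathcal{L}(eval(\mathcal{A}))$

Proof. By definition of propag (Def. 13), we have that $propag(\Delta) = \Delta$ if $\exists \lambda \to q \in \Delta \wedge eval(\lambda_1 \bullet \ldots \bullet \lambda_k) \sqsubseteq \lambda$, or $propag(\Delta) = \Delta \cup \{eval(\lambda_1 \bullet \ldots \bullet \lambda_k) \to q\}$. In each case, $\Delta \subseteq propag(\Delta)$.

By definition of $eval$ (Def. 14), $eval(\Delta) = \mu X . propag(X) \cup \Delta$. Since $\Delta \subseteq propag(\Delta)$, we have that $\Delta \subseteq eval(\Delta)$. Then we can deduce that $\mathcal{L}(\mathcal{A}) \subseteq \mathcal{L}(eval(\mathcal{A}))$.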



Observe that the fixpoint computation may not terminate. Indeed, consider $\Delta = \{[3,6] \to q_1, [2,8] \to q_2, q_1 + q_2 \to q_2\}$. The first iteration of the fixpoint will evaluate the term $[3,6] + [2,8]$ recognized by $q_1 + q_2 \to q_2$, which adds the transition $[5,14] \to q_2$. Since a new element is in the state $q_2$, the second iteration will evaluate the term $[3,6] + [5,14]$ recognized by the transition $q_1 + q_2 \to q_2$, and will add the transition $[8,20] \to q_2$. The third iteration will evaluate the term $[3,6] + [8,20]$ to $q_2$ and this pattern will be repeated in further iterations. Since there will always be a new element of the lattice associated to $q_2$, the computation of the evaluation will not terminate. It is thus necessary to apply a widening operator $\nabla_\Lambda : \Lambda \times \Lambda \mapsto \Lambda$ to force the computation of propag to terminate. For example, if we apply such a widening operator on the example above, after 3 iterations of the propag function, the transitions $[2,8] \to q_2$, $[5,14] \to q_2$, $[8,20] \to q_2$ could be replaced by $[2,+\infty[ \to q_2$.
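The divergence and its cure can be reproduced with a short Python sketch. It uses the classical interval widening (an unstable bound jumps to infinity), which is one possible choice for $\nabla_\Lambda$ and is an assumption here, since the text does not fix a particular operator. The function names are hypothetical.

```python
import math

def join(a, b):
    """Least upper bound of two intervals."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(old, new):
    """Classical interval widening: keep stable bounds, push unstable ones to infinity."""
    lo = old[0] if old[0] <= new[0] else -math.inf
    hi = old[1] if old[1] >= new[1] else math.inf
    return (lo, hi)

def iterate_with_widening(q1, q2, rounds=3):
    """Collect `rounds` values for q2 under q1 + q2 -> q2, then widen them."""
    seen = [q2]
    for _ in range(rounds - 1):
        last = seen[-1]
        seen.append((q1[0] + last[0], q1[1] + last[1]))
    acc = seen[0]
    for value in seen[1:]:
        acc = widen(acc, join(acc, value))
    return seen, acc

values, widened = iterate_with_widening((3, 6), (2, 8))
```

Here `values` holds the three intervals `[2, 8]`, `[5, 14]` and `[8, 20]` from the paragraph above, and `widened` is the single interval with lower bound 2 and an infinite upper bound that replaces them.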

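Definition 15 (Automaton completion for LTA). Let $\mathcal{A}$ be a tree automaton, $\mathcal{R}$ a TRS and $E$ a set of equations. At a step $i$ of completion, we denote by $\mathcal{A}^i_{\mathcal{R},E}$ the LTA such that $\mathcal{A}^i_\mathcal{R} \leadsto_E \mathcal{A}^i_{\mathcal{R},E}$.

- Repeat $\mathcal{A}^{i+1}_{\mathcal{R},E} = \mathcal{A}'$ with $\mathcal{C}_\mathcal{R}(eval(\mathcal{A}^i_{\mathcal{R},E})) \leadsto_E \mathcal{A}''$ and $eval(\mathcal{A}'') = \mathcal{A}'$,

- Until a fixpoint $\mathcal{A}^*_{\mathcal{R},E} = \mathcal{A}^k_{\mathcal{R},E} = \mathcal{A}^{k+1}_{\mathcal{R},E}$ (with $k \in \mathbb{N}$) is reached.

A running example is described in Section 5.

Theorem 4 (Completeness). Let $\mathcal{R}$ be a left-linear TRS, $\mathcal{A}$ be a tree automaton and $E$ be a set of linear equations. If completion terminates on $\mathcal{A}^*_{\mathcal{R},E}$ then

$$\mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}) \supseteq \mathcal{R}^*(\mathcal{L}(\mathcal{A}))$$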

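Proof. We first show that $\mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}) \supseteq \mathcal{L}(\mathcal{A})$. By definition, completion only adds transitions to $\mathcal{A}$. Hence, we trivially have $\mathcal{L}(\mathcal{A}^1_\mathcal{R}) \supseteq \mathcal{L}(\mathcal{A})$. Thanks to Theorem 2, we also know that $\mathcal{A}^1_{\mathcal{R},E}$, the transformation of $\mathcal{A}^1_\mathcal{R}$ by merging states equivalent w.r.t. $E$, is such that $\mathcal{L}(\mathcal{A}^1_{\mathcal{R},E}) \supseteq \mathcal{L}(\mathcal{A}^1_\mathcal{R})$. Hence, by transitivity of $\supseteq$, we know that $\mathcal{L}(\mathcal{A}^1_{\mathcal{R},E}) \supseteq \mathcal{L}(\mathcal{A})$. This can be successively applied to $\mathcal{A}^1_{\mathcal{R},E}, \mathcal{A}^2_{\mathcal{R},E}, \mathcal{A}^3_{\mathcal{R},E}, \ldots$ so that $\mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}) \supseteq \mathcal{L}(\mathcal{A})$.

Now, the next step of the proof consists in showing that for all terms $s \in \mathcal{L}(\mathcal{A})$, if $s \to^*_\mathcal{R} t$ then $t \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E})$. First, note that by definition of the application of $E$, final states are preserved, i.e. if $q$ is a final state in $\mathcal{A}$ and $\mathcal{A}'$ is the automaton where $E$ is applied in $\mathcal{A}$ and $q$ has been renamed into $q'$, then $q'$ is a final state of $\mathcal{A}'$. Hence it is enough to prove that for all terms $s \in \mathcal{L}(\mathcal{A}, q)$, if $s \to^*_\mathcal{R} t$ then $\exists q' : t \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$. Because of the previous result saying that $\mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}) \supseteq \mathcal{L}(\mathcal{A})$, from $s \in \mathcal{L}(\mathcal{A}, q)$ we obtain that there exists a state $q'$ such that $s \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$. We know that $s \to^*_\mathcal{R} t$; hence, what we have to show is that $t \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$. By induction on the length of $\to^*_\mathcal{R}$, we obtain that:

- if the length is zero then $s \to^*_\mathcal{R} s$ and we trivially have that $s \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$.

- assume now that the property is true for any rewriting derivation of length less than or equal to $n$; we prove that the property remains valid for a derivation of length less than or equal to $n + 1$. Assume that we have $s \to^n_\mathcal{R} s' \to_\mathcal{R} t$. Using the induction hypothesis, we obtain that $s' \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$. It remains to prove that $t \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$ can be deduced from $s' \to_\mathcal{R} t$. Since $s' \to_\mathcal{R} t$, we know that there exist a rewrite rule $l \to r \Leftarrow c_1 \wedge \ldots \wedge c_n$, a position $p$ and a substitution $\mu : \mathcal{X} \mapsto \mathcal{T}(\mathcal{F})$ such that $s' = s'[l\mu]_p \to_\mathcal{R} eval(s'[r\mu]_p) = t$ and for all $i \in [1,n]$, $c_i\mu = true$. Since $s' \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$, by definition of the language of an LTA, we get that there exists $s''$ such that $s' \sqsubseteq s''$ and $s'' \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q'$. We can deduce that $s''[l\mu]_p \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q'$ and, by definition of tree automata derivation, that there exists a state $q''$ such that $l\mu \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q''$ and $s''[q'']_p \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q'$. Let $Var(l) = \{x_1, \ldots, x_n\}$, $l = l[x_1, \ldots, x_n]$ and $t_1, \ldots, t_n \in \mathcal{T}(\mathcal{F})$ such that $\mu = \{x_1 \mapsto t_1, \ldots, x_n \mapsto t_n\}$. Since $l\mu = l[t_1, \ldots, t_n] \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q''$, we know that there exist states $q_1, \ldots, q_n$ such that $\forall i \in [1,n]$, $t_i \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q_i$ and $l[q_1, \ldots, q_n] \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q''$. Let $\sigma = \{x_1 \mapsto q_1, \ldots, x_n \mapsto q_n\}$; we thus have that $l\sigma \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q''$. Since $\mathcal{A}^*_{\mathcal{R},E}$ is a fixpoint of completion, from $l\sigma \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q''$ and the fact that for all $i \in [1,n]$, $c_i\mu = true$, we can deduce that $r\sigma \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q''$. Furthermore, since $\forall i \in [1,n]$, $t_i \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q_i$, then $r\mu \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q''$. Since, besides this, $s''[q'']_p \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q'$, we have that $s''[r\mu]_p \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q'$. Since $s' \sqsubseteq s''$, this means by definition that $eval(s') \sqsubseteq eval(s'')$. Finally, since $s''[r\mu]_p \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q'$ and $eval(s') \sqsubseteq eval(s'')$, we can deduce that $t = eval(s'[r\mu]_p) \to^*_{\mathcal{A}^*_{\mathcal{R},E}} q'$, hence $t \in \mathcal{L}(\mathcal{A}^*_{\mathcal{R},E}, q')$.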

Observe that the reverse does not hold as widening in evaluation may introduce 
over-approximations. 

Remark 1. We have two infinite dimensions, one due to the state space and one due to the infinite domain. The infinite behaviour of the system is abstracted thanks to the equations, and all the infinite behaviours due to the operations on elements of the lattice are captured by the widening step included in the evaluation step. Indeed, if lambda transitions are added at each completion step with increasing (or decreasing) elements of the lattice (for example $[0,2] \to q$, $[2,4] \to q$, $[4,6] \to q$, $\ldots$), we have to perform a widening (here $[0,+\infty[$) to ensure the termination of the computation. But an infinite increasing (or decreasing) sequence of lambda transitions is necessarily obtained from a predefined operation of the lattice used in the rewrite rules. For example, the increasing sequence described above is necessarily obtained from a rewrite rule of the form $u(\ldots, x, \ldots) \to v(\ldots, x + 2, \ldots)$. If we have the matching $x \mapsto q_1$ and the transition $[2,2] \to q_2$, then completion will add the transition $q_1 + q_2 \to q_3$, and since this rewrite rule leads to an infinite behaviour (always adding 2), we would have an infinite sequence $q_3 + q_2 \to q_4$, $q_4 + q_2 \to q_5$, and so on. To solve this problem, it is necessary to use an equation of the form $x = x + 2$. Then, $q_1$ is merged with $q_3$ and we have a transition $q_1 + q_2 \to q_1$ with an infinite evaluation, abstracted thanks to the widening step included in the evaluation step. To summarize, an infinite sequence of lambda transitions is necessarily obtained from an operation used in the rewriting system, and since the transitions of an LTA containing operations have to be evaluated, the infinite behavior is always solved during the evaluation step. We can observe this on the example described hereafter in Section 5.



5 A running example 

Let $\mathbb{N}$ be the concrete domain, the set of intervals on $\mathbb{N}$ be the lattice, $\mathcal{R} = \{f(x) \to cons(x, f(x + 1)) \Leftarrow x < 3 \ (A),\ f(x) \to cons(x, f(x + 2)) \Leftarrow x > 2 \ (B)\}$ be the TRS, $\mathcal{A}_0$ the LTA representing the set of initial configurations, with the following set of transitions: $\Delta_1 = \{[1,2] \to q_1, f(q_1) \to q_2\}$, and $E = \{x = x + 2 \Leftarrow x \geq 5\}$ the set of equations. We decide to perform a widening after three steps.

First step of completion 

One step completed automaton: we can apply the rewrite rule (A) with the substitution $x \mapsto q_1$, and so add $Norm(cons(q_1, f(q_1 + 1)) \to q'_2)$ and $q'_2 \to q_2$ to $\Delta_1$.

So we have $\Delta_2 = \Delta_1 \cup \{cons(q_1, q_3) \to q'_2,\ q'_2 \to q_2,\ f(q_4) \to q_3,\ q_1 + q_{[1,1]} \to q_4,\ [1,1] \to q_{[1,1]}\}$.

Since there are new transitions, we have to perform the evaluation step: the transition $q_1 + q_{[1,1]} \to q_4$ can be evaluated, so $eval(\Delta_2) = \Delta_2 \cup \{[2,3] \to q_4\}$.

Abstraction by merging states according to equations: we cannot apply the set of equations yet because there is no state recognizing "x + 2" such that $x \geq 5$.

Second step of completion 

One step completed automaton: we can apply the rewrite rules (A) and (B) with the substitution $x \mapsto q_4$, but this will be restricted by the solver. In fact, (A) will be applied on $[2,2]$ (condition $x < 3$), and (B) will be applied on $[3,3]$. So $Norm(cons([2,2], f([2,2] + 1)) \to q'_3)$, $Norm(cons([3,3], f([3,3] + 2)) \to q'_3)$ and $q'_3 \to q_3$ will be added to $eval(\Delta_2)$.

So we have $\Delta_3 = eval(\Delta_2) \cup \{[2,2] \to q_{[2,2]},\ cons(q_{[2,2]}, q_5) \to q'_3,\ q'_3 \to q_3,\ f(q_6) \to q_5,\ q_{[2,2]} + q_{[1,1]} \to q_6,\ [3,3] \to q_{[3,3]},\ cons(q_{[3,3]}, q_7) \to q'_3,\ f(q_8) \to q_7,\ q_{[3,3]} + q_{[2,2]} \to q_8\}$.

Evaluation step: $eval(\Delta_3) = \Delta_3 \cup \{[3,3] \to q_6,\ [5,5] \to q_8\}$. Since $[3,3] \to q_{[3,3]}$ and $[3,3] \to q_6$, we can merge states $q_{[3,3]}$ and $q_6$.

Abstraction step: we cannot apply the set of equations yet.

Third step of completion 

One step completed automaton: we can apply the rewrite rule (B) with the substitution $x \mapsto q_8$. So $Norm(cons(q_8, f(q_8 + 2)) \to q'_7)$ and $q'_7 \to q_7$ will be added to $Merge(eval(\Delta_3), q_{[3,3]}, q_6)$.

So we have $\Delta_4 = Merge(eval(\Delta_3), q_{[3,3]}, q_6) \cup \{cons(q_8, q_9) \to q'_7,\ q'_7 \to q_7,\ f(q_{10}) \to q_9,\ q_8 + q_{[2,2]} \to q_{10}\}$.

Evaluation step: $eval(\Delta_4) = \Delta_4 \cup \{[7,7] \to q_{10}\}$.

Abstraction step: since $q_8 + q_{[2,2]} \to q_{10}$, $[5,5] \to q_8$ and $\gamma([5,5]) > 4$, $q_8$ and $q_{10}$ are merged according to the set of equations $E$.

Fourth step of completion 

Let us see the full automaton at this step. We have $Merge(eval(\Delta_4), q_8, q_{10}) = \{[1,2] \to q_1,\ f(q_1) \to q_2,\ cons(q_1, q_3) \to q'_2,\ q'_2 \to q_2,\ f(q_4) \to q_3,\ q_1 + q_{[1,1]} \to q_4,\ [1,1] \to q_{[1,1]},\ [2,3] \to q_4,\ [2,2] \to q_{[2,2]},\ cons(q_{[2,2]}, q_5) \to q'_3,\ q'_3 \to q_3,\ f(q_6) \to q_5,\ q_{[2,2]} + q_{[1,1]} \to q_6,\ [3,3] \to q_6,\ cons(q_6, q_7) \to q'_3,\ f(q_8) \to q_7,\ q_6 + q_{[2,2]} \to q_8,\ [5,5] \to q_8,\ cons(q_8, q_9) \to q'_7,\ q'_7 \to q_7,\ f(q_8) \to q_9,\ q_8 + q_{[2,2]} \to q_8,\ [7,7] \to q_8\}$.

Since the transitions have been modified by the equations, we have to perform an evaluation step. We can notice that the evaluation of the transition $q_8 + q_{[2,2]} \to q_8$ is infinite. In fact, it will add $[7,7] \to q_8$, $[9,9] \to q_8$, $[11,11] \to q_8$, and so on. So we have to perform widening, that is to say, replace all the transitions $\lambda \to q_8$ by $[5,+\infty[ \to q_8$.

One step completed automaton: thanks to the widening performed at the previous evaluation step, no more rules have to be added to the current automaton. We have a fixed point which is an over-approximation of the set of reachable states, and the completion stops.



6 On Improving the Verification of Java Programs by 
TRMC 

We now show how our formalism can simplify the analysis of Java programs. In [5], the authors developed a tool called Copster [7] to compile a Java .class file into a Term Rewriting System (TRS). The obtained TRS models exactly a subset of the semantics³ of the Java Virtual Machine (JVM) by rewriting a term representing the state of the JVM. States are of the form IO(st, in, out) where st is a program state, in is an input stream and out an output stream. A program state is a term of the form state(f, fs, h, k) where f is the current frame, fs is the stack of calling frames, h a heap and k a static heap. A frame is a term of the form frame(m, pc, s, l) where m is a fully qualified method name, pc a program counter, s an operand stack and l an array of local variables. The frame stack is the call stack of the frame currently being executed: f. For a given program point pc in a given method m, Copster builds an xframe term very similar to the original frame term but with the current instruction explicitly stated, in order to compute intermediate steps.

One of the major difficulties of this encoding is to capture and handle the two-sided infinite dimension that can arise in Java programs. Indeed, in such models, infinite behaviors may be due to unbounded method calls and object creation, or simply to the program manipulating unbounded data such as integer variables. While multiple infinite behaviors can be over-approximated with completion (just like $a^n b^n$ can be approximated by $a^* b^*$), this may require manipulating structures of large size. As an example, in [5], it was decided to encode the structure of configurations in an efficient manner, integer variables being encoded in Peano arithmetic. Not only does this choice have an impact on the size of the automata used to encode sets of configurations, but each classical arithmetic operation may also require the application of several rules.

³ Essentially basic types, arithmetic, object creation, field manipulation, virtual method invocation, as well as a subset of the String library.

As an example, let us consider the simple arithmetic operation "300 + 400".
By using [5], this operation is represented by xadd(succ^300(zero), succ^400(zero)),
which is reduced by the rewriting rules detailed hereafter, applied several
hundred times:

xadd(zero, zero) → result(zero)
xadd(succ(var(a)), pred(var(b))) → xadd(var(a), var(b))
xadd(pred(var(a)), succ(var(b))) → xadd(var(a), var(b))
xadd(succ(var(a)), succ(var(b))) → xadd(succ(succ(var(a))), var(b))
xadd(pred(var(a)), pred(var(b))) → xadd(pred(pred(var(a))), var(b))
xadd(succ(var(a)), zero) → result(succ(var(a)))
xadd(pred(var(a)), zero) → result(pred(var(a)))
xadd(zero, succ(var(b))) → result(succ(var(b)))
xadd(zero, pred(var(b))) → result(pred(var(b)))
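
To make the cost of this encoding concrete, the xadd rules above can be sketched as a small rewriting loop that counts rule applications. This is only an illustration: the nested-tuple term representation and the names peano, to_int, xadd_step and normalize are assumptions made here, not part of Copster.

```python
ZERO = ("zero",)

def peano(n):
    """Build succ^n(zero) for n >= 0, or pred^|n|(zero) for n < 0."""
    t = ZERO
    for _ in range(abs(n)):
        t = ("succ", t) if n > 0 else ("pred", t)
    return t

def to_int(t):
    """Decode a Peano term back into a Python integer."""
    n = 0
    while t != ZERO:
        n += 1 if t[0] == "succ" else -1
        t = t[1]
    return n

def xadd_step(a, b):
    """Apply the single xadd rule that matches xadd(a, b) at the root."""
    if b[0] == "zero":
        return ("result", a)         # xadd(_, zero) -> result(_)
    if a[0] == "zero":
        return ("result", b)         # xadd(zero, _) -> result(_)
    if a[0] != b[0]:
        return ("xadd", a[1], b[1])  # succ and pred cancel each other
    return ("xadd", (a[0], a), b[1]) # move one succ (or pred) from b to a

def normalize(a, b):
    """Rewrite xadd(a, b) to result(t); return t and the number of steps."""
    term, steps = ("xadd", a, b), 0
    while term[0] == "xadd":
        term = xadd_step(term[1], term[2])
        steps += 1
    return term[1], steps
```

For instance, normalize(peano(3), peano(4)) yields the Peano term for 7 after 5 rule applications. With these rules the number of steps grows linearly with the size of the operands, which is what makes large constants expensive.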

This means that if at the program point pc of method m there is a bytecode
add, then we switch to an xframe in order to compute the addition, i.e.
apply frame(m, pc, s, l) → xframe(add, m, pc, s, l). To compute the result of
the addition of the first two elements of the stack, we have to apply the rule
xframe(add, m, pc, stack(b, stack(a, s)), l) → xframe(xadd(a, b), m, pc, s, l).
Once the result is computed thanks to all the rewrite rules of xadd, we can
compute the next operation of m, i.e. go to the next program point by applying
xframe(result(x), m, pc, s, l) → frame(m, next(pc), stack(x, s), l).

The use of LTA can drastically simplify the above operations. Indeed,
in our framework, we can encode natural numbers and operations directly
in the alphabet of the automaton. In this context, the series of
applications of rewriting rules is replaced by a one-step evaluation. As
an example, the rewrite rule xframe(add, m, pc, stack(b, stack(a, s)), l) →
xframe(xadd(a, b), m, pc, s, l) and the xadd rules encoding addition can be
replaced by xframe(add, m, pc, stack(b, stack(a, s)), l) → xframe(result(a +
b), m, pc, s, l). The evaluation step of LTA completion then computes the
result of the addition a + b and adds the resulting term to the language of
the automaton.
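
The one-step evaluation can be pictured with the following minimal sketch. It assumes that values are elements of an interval lattice and that next(pc) is pc + 1. Both are simplifications chosen for illustration, and the names Interval, step_add and step_result are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    """An element of an interval lattice, standing for a set of integers."""
    lo: float
    hi: float

    def __add__(self, other):
        # Interval addition is evaluated directly, with no rewriting involved.
        return Interval(self.lo + other.lo, self.hi + other.hi)

def step_add(xframe):
    """xframe(add, m, pc, stack(b, stack(a, s)), l)
       -> xframe(result(a + b), m, pc, s, l)"""
    op, m, pc, stack, local = xframe
    assert op == "add"
    b, (a, rest) = stack             # stack(b, stack(a, s)) as nested pairs
    return (("result", a + b), m, pc, rest, local)

def step_result(xframe):
    """xframe(result(x), m, pc, s, l) -> frame(m, next(pc), stack(x, s), l)"""
    (tag, x), m, pc, stack, local = xframe
    assert tag == "result"
    return ("frame", m, pc + 1, (x, stack), local)   # next(pc) assumed to be pc + 1
```

Starting from an xframe whose stack holds [400, 400] on top of [300, 300], a single call to step_add produces result([700, 700]), whatever the size of the operands.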

Other operations such as "if-then-else" can also be drastically simplified 
by using our formalism. Indeed, with Peano numbers the evaluation of the 
condition of the instruction "if" requires several rules. As an example, the 
instruction "if a=b then go to the program point x" is encoded by the 
term ifEqint(x,a,b), and the following rules will be applied: 
ifEqint(x, zero, zero) → ifXx(valtrue, x)
ifEqint(x, succ(a), pred(b)) → ifXx(valfalse, x)
ifEqint(x, pred(a), succ(b)) → ifXx(valfalse, x)
ifEqint(x, succ(a), succ(b)) → ifEqint(x, a, b)
ifEqint(x, pred(a), pred(b)) → ifEqint(x, a, b)
ifEqint(x, succ(a), zero) → ifXx(valfalse, x)
ifEqint(x, pred(a), zero) → ifXx(valfalse, x)
ifEqint(x, zero, succ(b)) → ifXx(valfalse, x)
ifEqint(x, zero, pred(b)) → ifXx(valfalse, x)

Rules of this type will disappear with LTA because an equality between two 
elements is directly evaluated, and so are all the predefined predicates. 
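
As a rough illustration of what is saved, the ifEqint rules can be sketched as a loop over Peano terms that counts rule applications. The tuple representation and the names peano and if_eqint are assumptions made for this example only.

```python
ZERO = ("zero",)

def peano(n):
    """Build succ^n(zero) for n >= 0, or pred^|n|(zero) for n < 0."""
    t = ZERO
    for _ in range(abs(n)):
        t = ("succ", t) if n > 0 else ("pred", t)
    return t

def if_eqint(a, b):
    """Rewrite ifEqint(x, a, b) until ifXx(valtrue, x) or ifXx(valfalse, x).
    Returns the truth value and the number of rule applications."""
    steps = 0
    while True:
        steps += 1
        if a == ZERO and b == ZERO:
            return True, steps       # ifEqint(x, zero, zero) -> valtrue
        if a[0] == b[0]:
            a, b = a[1], b[1]        # succ/succ or pred/pred: strip one symbol
        else:
            return False, steps      # every remaining rule yields valfalse
```

Comparing 3 with 3 takes four rule applications in this encoding, and the count grows with the operands, whereas a directly evaluated predicate such as a == b is a single step.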

In Copster, if at the program point pc of the method m we have
an "if" whose condition is an equality between two elements, we
switch to an xframe where the operation to evaluate is an "if" with
an equality condition between the first two elements of the stack, and
which goes to a program point x if the condition is true. Then we
can apply the rule xframe(ifACmpEq(x), m, pc, stack(b, stack(a, s)), l) →
xframe(ifEqint(x, a, b), m, pc, s, l), which makes it possible to compute the
result, i.e. it triggers the ifEqint rules detailed above.

According to the result returned by these rules, we go to program point
x if the condition is true, or else to the next program point. This is modeled
by the following two rules:

xframe(ifXx(valtrue, x), m, pc, s, l) → frame(m, x, s, l)
xframe(ifXx(valfalse, x), m, pc, s, l) → frame(m, next(pc), s, l)

In LTA completion, thanks to the fact that predicates are directly evaluated
and that we have conditional rules, all these rules are replaced by the
following two conditional rules:

xframe(ifACmpEq(x), m, pc, stack(b, stack(a, s)), l) → frame(m, x, s, l) ⇐ a = b
(if a = b, we go to program point x)
xframe(ifACmpEq(x), m, pc, stack(b, stack(a, s)), l) → frame(m, next(pc), s, l) ⇐ a ≠ b
(if a ≠ b, we go to the next program point)
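
The two conditional rules can be sketched as a single function whose branches play the role of the guards. The sketch follows the stated intent that the case a ≠ b moves to the next program point. It uses concrete integers instead of lattice elements and takes next(pc) to be pc + 1. These choices, like the name step_if_acmp_eq, are assumptions made for illustration.

```python
def step_if_acmp_eq(xframe):
    """Apply the conditional rules for
    xframe(ifACmpEq(x), m, pc, stack(b, stack(a, s)), l)."""
    (op, x), m, pc, stack, local = xframe
    assert op == "ifACmpEq"
    b, (a, rest) = stack             # stack(b, stack(a, s)) as nested pairs
    if a == b:
        # guard a = b: jump to program point x
        return ("frame", m, x, rest, local)
    # guard a != b: fall through to the next program point
    return ("frame", m, pc + 1, rest, local)
```

Each call consumes the two topmost stack elements and produces the successor frame directly, with no intermediate ifEqint or ifXx terms.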



7 Conclusion and Future work 

We have proposed LTA, a new extension of tree automata for tree regular model 
checking of infinite-state systems whose configurations can be represented with 
interpreted terms. One of our main contributions is the development of a new 
completion algorithm for such automata. We also give strong arguments that our
encoding can drastically improve the verification of JAVA programs in a TRMC-like
environment. As future work, we plan to implement the simplifications of
Section 6 in Copster and combine them with abstraction refinement techniques.